
> but people should use llama.cpp instead

MLX is a lot more performant than Ollama and llama.cpp on Apple Silicon, in both peak memory usage and tok/s output.

edit: LM Studio benefits from MLX optimizations when running MLX-compatible models.
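
For anyone who wants to sanity-check the tok/s and memory claim themselves, a minimal sketch using the mlx-lm package (the model repo name is just an example; exact output format may vary by version):

    # pip install mlx-lm  (Apple Silicon only)
    from mlx_lm import load, generate

    # Example repo: any 4-bit MLX conversion from the mlx-community org should work
    model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

    # verbose=True prints prompt/generation tokens-per-sec and peak memory stats
    text = generate(
        model,
        tokenizer,
        prompt="Explain KV caching in one paragraph.",
        max_tokens=256,
        verbose=True,
    )

To keep the comparison apples-to-apples, run llama.cpp's llama-bench against the same model at the same quantization level and compare the reported numbers.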




