
> but people should use llama.cpp instead

MLX is a lot more performant than Ollama and llama.cpp on Apple Silicon, in both peak memory usage and tok/s output.

edit: LM Studio benefits from MLX optimizations when running MLX-compatible models.
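
For anyone who wants to sanity-check the tok/s and memory claim themselves, a minimal sketch using the mlx-lm package (the model repo name is just an example; exact output format may vary by version):

    # pip install mlx-lm  (Apple Silicon only)
    from mlx_lm import load, generate

    # Example repo: any 4-bit MLX conversion from the mlx-community org should work
    model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

    # verbose=True prints prompt/generation tokens-per-sec and peak memory stats
    text = generate(
        model,
        tokenizer,
        prompt="Explain KV caching in one paragraph.",
        max_tokens=256,
        verbose=True,
    )

To keep the comparison apples-to-apples, run llama.cpp's llama-bench against the same model at the same quantization level and compare the reported numbers.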




