Devstral (24B)
Best open model for coding agents. Designed for agentic coding workflows. 391K downloads.
With 24GB of unified memory, the Mac mini M4 Pro is a compact local AI workstation. This page lists local AI models that fit its memory budget, with realistic performance expectations for LM Studio and similar runtimes.
For the Mac mini M4 Pro 24GB, start with Devstral (24B). Models marked “Comfortable” leave useful memory headroom; “Tight but possible” can work, but close other apps and prefer a lower quantization.
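As a back-of-envelope fit check: quantized weights take roughly total parameters × bits per weight ÷ 8 bytes, and whatever remains of the 24GB after the OS share is headroom for the KV cache and runtime buffers. Below is a minimal sketch of that arithmetic; the OS reserve and the headroom thresholds are illustrative assumptions, not measured values, and ~4.8 bits/weight is an approximation for Q4_K_M-style quants.

```python
def fits_in_memory(params_b: float, quant_bits: float = 4.8,
                   unified_gb: float = 24.0, os_reserve_gb: float = 6.0) -> str:
    """Rough fit check for a quantized model on unified memory.

    params_b      : total parameters, in billions (24 for Devstral)
    quant_bits    : effective bits per weight (~4.8 for Q4_K_M, an approximation)
    os_reserve_gb : assumed reserve for macOS + other apps (illustrative)
    """
    weights_gb = params_b * quant_bits / 8       # bytes per weight = bits / 8
    budget_gb = unified_gb - os_reserve_gb       # memory left for the model
    headroom_gb = budget_gb - weights_gb         # room for KV cache, buffers
    if headroom_gb > 3:
        return f"Comfortable: {weights_gb:.1f} GB weights, {headroom_gb:.1f} GB headroom"
    if headroom_gb > 0:
        return f"Tight but possible: {weights_gb:.1f} GB weights, {headroom_gb:.1f} GB headroom"
    return f"Does not fit: {weights_gb:.1f} GB weights vs {budget_gb:.1f} GB budget"

# Devstral 24B at a Q4_K_M-class quant on this machine:
print(fits_in_memory(24))  # ~14.4 GB of weights -> Comfortable
```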
The sweet spot. Incredible reasoning, coding and chat quality. The best model you can run on 16GB.
Gemma 4 MoE flagship for workstations: 26B total parameters with ~4B active. 256K context and excellent quality-per-watt for local inference. Apache 2.0.
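A layout like this trades memory for speed: every expert must stay resident, so memory cost follows total parameters, while per-token compute follows only the active parameters. A rough sketch of that tradeoff; the ~4.8 bits/weight figure is the same Q4_K_M-style assumption as above, and 2 FLOPs per active weight per token is the usual decode approximation.

```python
def moe_tradeoff(total_b: float, active_b: float, quant_bits: float = 4.8):
    """Memory scales with total params; per-token compute with active params."""
    weights_gb = total_b * quant_bits / 8     # all experts resident in memory
    flops_per_token = 2 * active_b * 1e9      # ~2 FLOPs per active weight/token
    return weights_gb, flops_per_token

# 26B total / ~4B active (per the blurb above), 4-bit-class quantization:
gb, flops = moe_tradeoff(26, 4)
print(f"~{gb:.0f} GB of weights, {flops:.1e} FLOPs per token")
# ~16 GB resident, yet it decodes roughly like a 4B dense model,
# which is where the quality-per-watt advantage comes from.
```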
Qwen flagship coding model. Designed for agentic coding with 256K context. Outperforms Claude 3.5 Sonnet on SWE-bench. Apache 2.0.
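Note that a long context window costs memory on top of the weights: the KV cache grows linearly with the number of tokens. The sketch below uses hypothetical architecture numbers (layer count, GQA KV heads, head dimension) as stand-ins for a 24B-class model, not this model's published config.

```python
def kv_cache_gb(tokens: int, layers: int = 40, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    """KV cache = 2 (K and V) x layers x kv_heads x head_dim x bytes x tokens."""
    per_token_bytes = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return tokens * per_token_bytes / 1024**3

# Hypothetical 24B-class config: 40 layers, 8 GQA KV heads, d_head=128, fp16
print(f"{kv_cache_gb(8_192):.1f} GB at 8K tokens")      # ~1.2 GB, easy
print(f"{kv_cache_gb(262_144):.1f} GB at 256K tokens")  # ~40 GB, far over budget
# On a 24GB machine, plan for modest context lengths (or KV-cache quantization)
# even when a model card advertises a 256K window.
```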
Zhipu AI's efficient MoE powerhouse. 106B total parameters, only 14B active at inference — dense-model speed with much larger model quality. Clearly the best in the 16–24GB RAM range. Outperforms Llama 3.3 70B. Apache 2.0.
Zyphra's Apache-2.0 reasoning MoE: 8.4B total parameters with only ~760M active, 16 experts, 131K context, Compressed Convolutional Attention and strong math/code benchmarks. Experimental for local use today: currently needs Zyphra vLLM/Transformers forks; LM Studio/GGUF/MLX support is not yet verified.
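Until mainstream runtimes catch up, trying it locally means the Transformers route through Zyphra's fork. The sketch below is a hypothetical illustration only: the placeholder repo id and the trust_remote_code flow are assumptions, not verified instructions; follow the official model card instead.

```python
# Hypothetical sketch. Assumes Zyphra's Transformers fork (plus accelerate) is
# installed and that the checkpoint ships custom modeling code. The repo id is
# a placeholder, not a verified path.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Zyphra/<model-repo>"  # placeholder; check Zyphra's model card

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,  # custom MoE/attention code ships with the repo
    torch_dtype="auto",      # let the checkpoint choose its dtype
    device_map="auto",       # requires accelerate; places weights on MPS/GPU
)

prompt = "Prove that the square root of 2 is irrational."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```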
Zhipu AI lightweight flagship. Strong bilingual CN/EN with hybrid thinking mode, 200K context and tool calling. Apache 2.0 — excellent alternative to Qwen 3.5 9B on modest GPUs.
ServiceNow x NVIDIA mid-size reasoner. Half the memory of 32B reasoners with comparable performance on MBPP, BFCL, GPQA. Strong enterprise fit. MIT licensed.
Zhipu AI's latest flagship. Major upgrade over GLM-4 with enhanced reasoning and coding. Strong bilingual (CN/EN). Ranks #17 on global usage leaderboards. Apache 2.0.
Microsoft Phi-4 reasoning variant. Top choice for 14B reasoning — much better than DeepSeek R1 14B. Rivals larger models on math & logic.
Mistral AI's latest dense 24B. Improved instruction following, function calling, and reduced repetition. Strong European-language support. 128K context. Apache 2.0.
MiniMax's open-source MoE model. Outstanding long-context capabilities up to 200K tokens. Ranks #8 on global usage leaderboards with 23.5B monthly tokens. Apache 2.0.
This page is about local AI fit, not a live price tracker. Prices and availability change. If an Amazon link is present, it may be an affiliate link that supports LocalClaw at no extra cost.