Apple Silicon hardware guide

Best local LLMs for Mac mini M4 Pro 24GB

The Mac mini M4 Pro with 24GB of unified memory is a compact local AI workstation. This page lists local AI models that fit its memory budget, with realistic performance expectations for LM Studio and similar runtimes.

Chip: M4 Pro
Unified memory: 24GB
Compatible models: 114
Best pick: Devstral (24B)

Quick answer

For the Mac mini M4 Pro 24GB, start with Devstral (24B). Models marked “Comfortable” leave useful memory headroom; “Tight but possible” can work, but close other apps first and prefer a lower-bit quantization.
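As a rough way to sanity-check these ratings yourself: quantized weights take about (parameters × bits per weight ÷ 8) bytes, and macOS plus the KV cache need headroom on top. The sketch below uses assumed rule-of-thumb constants (≈4.8 bits/weight for Q4_K_M, ~6GB reserved for the OS and cache), not LM Studio's actual accounting:

```python
# Rough fit check for a quantized model on unified memory.
# Constants are assumptions, not LM Studio's real bookkeeping:
# ~4.8 bits/weight for Q4_K_M, ~6GB reserved for macOS + KV cache.

def estimate_fit(params_b: float, bits_per_weight: float = 4.8,
                 ram_gb: float = 24.0, reserved_gb: float = 6.0) -> dict:
    """Estimate whether a quantized model leaves headroom on this Mac."""
    weights_gb = params_b * bits_per_weight / 8  # 1B params ≈ bits/8 GB
    headroom_gb = ram_gb - reserved_gb - weights_gb
    verdict = ("comfortable" if headroom_gb > 4
               else "tight" if headroom_gb > 0
               else "does not fit")
    return {"weights_gb": round(weights_gb, 1),
            "headroom_gb": round(headroom_gb, 1),
            "verdict": verdict}

print(estimate_fit(24))  # Devstral 24B at Q4_K_M -> ~14.4GB weights, "tight"
print(estimate_fit(14))  # Qwen 3 14B at Q4_K_M -> ~8.4GB weights, "comfortable"
```

The verdicts line up with the labels on this page: a 24B model at Q4_K_M lands in "tight" territory, while 14B-class models leave comfortable headroom.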

Mac mini · M4 Pro · 24GB RAM · 512GB SSD · Sweet Spot

Top compatible local LLMs

#1 · Tight but possible

Devstral (24B)

24B · 20GB min · Q4_K_M · 14GB

Best open model for coding agents. Designed for agentic coding workflows. 391K downloads.

code · power
#2 · Good

Qwen 3 (14B)

14B · 16GB min · Q4_K_M · 9.5GB

The sweet spot. Incredible reasoning, coding and chat quality. The best model you can run on 16GB.

chat · code · reasoning · power · general
#3 · Tight but possible

Gemma 4 26B A4B

26B (A4B active) · 24GB min · Q4_K_M · 16GB

Gemma 4's workstation-class MoE flagship: 26B total parameters with ~4B active. 256K context and excellent quality-per-watt for local inference. Apache 2.0.

chat · code · reasoning · power · multimodal
#4 · Tight but possible

Qwen 3 Coder (30B)

30B · 24GB min · Q4_K_M · 18GB

Qwen flagship coding model. Designed for agentic coding with 256K context. Outperforms Claude 3.5 Sonnet on SWE-bench. Apache 2.0.

code · power · quality
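Large context windows like the 256K above are memory-hungry in their own right: the KV cache grows linearly with the number of tokens in context, which is why long agentic sessions can push a "tight" fit over the edge. The architecture numbers below are illustrative assumptions (48 layers, 8 grouped-query KV heads, head dim 128, FP16 cache), not Qwen 3 Coder's published configuration:

```python
# KV-cache growth: 2 tensors (K and V) per layer are cached for every
# token in context. Architecture numbers are assumptions for
# illustration, not a specific model's real config.

def kv_cache_gb(tokens: int, layers: int = 48, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return tokens * per_token / 1024**3

print(round(kv_cache_gb(4_096), 2))   # 0.75 GB at a 4K context
print(round(kv_cache_gb(32_768), 2))  # 6.0 GB at a 32K context
```

Under these assumptions, going from a 4K to a 32K context costs about 5GB of extra unified memory before the weights are even counted.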
#5 · Good

GLM 4.5 Air (MoE)

106B (14B active, MoE) · 16GB min · Q4_K_M · 9GB

Zhipu AI's efficient MoE powerhouse. 106B total parameters, only 14B active at inference — dense-model speed with much larger model quality. Clearly the best in the 16–24GB RAM range. Outperforms Llama 3.3 70B. Apache 2.0.

chat · code · power · quality · general
#6 · Tight but possible

ZAYA1-8B

8.4B (760M active, MoE) · 24GB min · BF16 (Zyphra fork) · 17GB

Zyphra's Apache-2.0 reasoning MoE: 8.4B total parameters with only ~760M active, 16 experts, 131K context, Compressed Convolutional Attention and strong math/code benchmarks. Experimental for local use today: currently needs Zyphra vLLM/Transformers forks; LM Studio/GGUF/MLX support is not yet verified.

chat · code · reasoning · math · experimental
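The BF16 figure in the ZAYA1 entry can be sanity-checked with simple arithmetic: at 16 bits per weight, each billion parameters costs 2GB of memory, while per-token compute in a MoE scales with the active parameters only. A minimal sketch, using the 8.4B total / 760M active figures from the entry above:

```python
# Memory vs. compute for a small MoE run unquantized at BF16.
# Illustrative arithmetic only: 2 bytes per weight in memory, but each
# token only exercises the active subset of experts.

def bf16_weights_gb(total_params_b: float) -> float:
    return total_params_b * 2  # 2 bytes/param at BF16

def active_fraction(active_params_b: float, total_params_b: float) -> float:
    return active_params_b / total_params_b

print(bf16_weights_gb(8.4))                  # 16.8 GB, matching the ~17GB above
print(round(active_fraction(0.76, 8.4), 2))  # ~9% of weights used per token
```

This is the MoE trade-off in one line: you pay for all 8.4B parameters in memory, but each token's forward pass touches only about 9% of them.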
#7 · Comfortable

GLM 4.6 Air (12B)

12B · 12GB min · Q4_K_M · 7.5GB

Zhipu AI lightweight flagship. Strong bilingual CN/EN with hybrid thinking mode, 200K context and tool calling. Apache 2.0 — excellent alternative to Qwen 3.5 9B on modest GPUs.

chat · code · reasoning · standard · general
#8 · Good

Apriel Nemotron 15B Thinker

15B · 16GB min · Q5_K_M · 9.5GB

ServiceNow x NVIDIA mid-size reasoner. Half the memory of 32B reasoners with comparable performance on MBPP, BFCL, GPQA. Strong enterprise fit. MIT licensed.

reasoning · code · power · general
#9 · Tight but possible

GLM 4.7

26B · 24GB min · Q4_K_M · 16GB

Zhipu AI's latest flagship. Major upgrade over GLM-4 with enhanced reasoning and coding. Strong bilingual (CN/EN). Ranks #17 on global usage leaderboards. Apache 2.0.

chat · code · power · quality · general
#10 · Comfortable

Phi-4 Reasoning (14B)

14B · 12GB min · Q5_K_M · 8.5GB

Microsoft Phi-4 reasoning variant. Top choice for 14B reasoning — much better than DeepSeek R1 14B. Rivals larger models on math & logic.

reasoning · code · power
#11 · Tight but possible

Mistral Small 3.2 (24B)

24B · 24GB min · Q5_K_M · 14GB

Mistral AI's latest dense 24B. Improved instruction following, function calling, and reduced repetition. Strong European-language support. 128K context. Apache 2.0.

chat · code · power · general · reasoning
#12 · Tight but possible

MiniMax M2.1

45B (MoE) · 24GB min · Q4_K_M · 18GB

MiniMax's open-source MoE model. Outstanding long-context capabilities up to 200K tokens. Ranks #8 on global usage leaderboards with 23.5B monthly tokens. Apache 2.0.

chat · code · power · quality · general

Buying note

This page is about local AI fit, not a live price tracker; prices and availability change. If an Amazon link is present, it may be an affiliate link that supports LocalClaw at no extra cost to you.