Qwen 3 (8B)
One of the best 8B models ever made. Thinking mode + lightning fast. The new king of 8B.
The MacBook Air M2 with 8GB of unified memory is a portable, entry-level local AI machine. This page lists local AI models that fit its memory budget, with realistic performance expectations for LM Studio and similar runtimes.
For MacBook Air M2 8GB, start with Qwen 3 (8B). Models marked “Comfortable” leave useful memory headroom; “Tight but possible” can work, but you should close other apps and prefer lower-bit quantizations.
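A rough way to judge whether a quantized model is “Comfortable” or “Tight” is to estimate its weight footprint from parameter count and bits per weight, then add runtime overhead. The sketch below is a back-of-the-envelope heuristic, not a measurement; the ~4.5 bits/weight figure for Q4-class quants and the 1.5 GB overhead allowance are assumptions.

```python
def approx_memory_gb(params_b: float, bits_per_weight: float, overhead_gb: float = 1.5) -> float:
    """Rough memory estimate for a quantized local model.

    params_b:        parameter count in billions (e.g. 8 for an 8B model)
    bits_per_weight: effective bits per weight of the quant (Q4-class quants
                     land around 4.5 bits/weight; an assumption, not exact)
    overhead_gb:     assumed allowance for KV cache and runtime overhead
    """
    weights_gb = params_b * bits_per_weight / 8  # bits -> bytes -> GB (decimal)
    return round(weights_gb + overhead_gb, 1)

# An 8B model at a Q4-class quant against an 8 GB budget:
print(approx_memory_gb(8, 4.5))  # ~6.0 GB: "tight but possible" once macOS takes its share
```

On an 8 GB machine the OS itself claims a few GB, which is why the same model can feel comfortable at 16 GB and tight at 8 GB.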
Gemma 4: a balanced edge model with strong multimodal quality and 256K context. Great for laptops and high-end mobile devices. Apache 2.0.
Sweet-spot small model. Surprisingly capable for its size with hybrid thinking, 256K context and strong multilingual support. Runs on 8 GB RAM. The go-to for MacBook Air M4 16 GB. Apache 2.0.
Microsoft's latest small miracle. Punches way above its weight in reasoning & code.
Google's multimodal gem. Understands text AND images natively. Great quality-to-size ratio.
DeepSeek's reasoning model distilled to 8B. Shows its thought process step-by-step. Mind-blowing for logic.
Updated R1 reasoning distilled to Qwen3-8B. Improved chain-of-thought with fewer hallucinations vs original R1 distills. MIT licensed.
Alibaba's hybrid-thinking micro-flagship. Toggles between instant answers and deep chain-of-thought reasoning on demand. 128K context, 29 languages, outperforms Qwen3-8B on reasoning benchmarks. Apache 2.0.
⭐ Mac Mini M4 16GB top pick! NVIDIA fine-tune of Llama 3.1. Hybrid /think • /no_think mode — deep reasoning on demand, instant chat otherwise. ~80–120 tok/s on Apple Silicon Metal. 128K context. Apache 2.0.
⭐ Mac Mini M4 16GB top pick! NVIDIA's hybrid model — distilled from 9B, keeps 95% of its quality. Hybrid attention + SSM layers = ~80–120 tok/s on Apple Silicon. Blazing fast, minimal RAM. NVIDIA Open Model License.
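Hybrid-thinking models like the ones above typically take a soft switch inside the prompt itself rather than a separate API flag. A minimal sketch of building such a prompt, assuming the `/think` / `/no_think` convention the blurbs mention (the exact token a given model honors is model-specific):

```python
def with_thinking(user_prompt: str, think: bool) -> str:
    """Append a soft thinking switch to the user turn.

    Assumes the model recognizes trailing /think and /no_think markers,
    as described for the hybrid models above; verify against your
    model's card before relying on it.
    """
    switch = "/think" if think else "/no_think"
    return f"{user_prompt} {switch}"

# Deep reasoning on demand, instant chat otherwise:
print(with_thinking("Prove that 17 is prime.", think=True))   # "... /think"
print(with_thinking("What's the capital of France?", think=False))  # "... /no_think"
```

The returned string goes into the user message of whatever chat runtime you use (e.g. LM Studio's local server); nothing else changes.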
The best small Qwen 3.5 for everyday use. Strong reasoning, coding and chat at 9B scale with hybrid thinking mode and 256K context. Runs on 8-16 GB RAM. Great for Mac Mini M4 Pro. Apache 2.0.
IBM enterprise-grade 8B. Trained for RAG, tool-use and structured output. Strong function calling and long-context performance (128K). Apache 2.0 with full data provenance.
This page is about local AI fit, not a live price tracker. Prices and availability change. If an Amazon link is present, it may be an affiliate link that supports LocalClaw at no extra cost.