DeepSeek R1 0528 Distill (8B)
Updated R1 reasoning distilled to Qwen3-8B. Improved chain-of-thought with fewer hallucinations vs original R1 distills. MIT licensed.
A static, Google-indexable guide to the best local AI models that fit an 8GB RAM budget. Built from the LocalClaw model database and ranked by quality, reasoning, coding, and speed.
With 8GB of RAM, prioritize models whose minimum RAM requirement is at or below 8GB, and leave headroom for the OS and context cache rather than filling memory completely. For most users, start with DeepSeek R1 0528 Distill (8B), then test a smaller, faster model if latency matters.
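As a quick sizing check, a quantized model's footprint can be estimated from its parameter count and quantization width. The helper below is a back-of-the-envelope sketch, not a measured value: the bits-per-weight figures are typical GGUF quantization averages, and the 1.5 GB overhead allowance for KV cache and runtime buffers is an assumption.

```python
def estimated_ram_gb(params_billion: float, bits_per_weight: float,
                     overhead_gb: float = 1.5) -> float:
    """Rough RAM estimate for a quantized model: weights + runtime overhead.

    bits_per_weight is roughly 4.5 for Q4_K_M, 5.5 for Q5_K_M, 16 for fp16.
    overhead_gb is an assumed allowance for context cache and runtime buffers.
    """
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits = 1 GB
    return round(weight_gb + overhead_gb, 1)

# An 8B model at Q4_K_M quantization fits an 8 GB budget with headroom:
print(estimated_ram_gb(8, 4.5))   # 6.0
# The same model at fp16 does not:
print(estimated_ram_gb(8, 16))    # 17.5
```

Real usage varies with context length and runtime, so treat anything within ~1 GB of the budget as a "test before committing" case.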
Alibaba's hybrid-thinking micro-flagship. Toggles between instant answers and deep chain-of-thought reasoning on demand. 128K context, 29 languages, outperforms Qwen3-8B on reasoning benchmarks. Apache 2.0.
⭐ Mac Mini M4 16GB top pick! NVIDIA fine-tune of Llama 3.1. Hybrid /think • /no_think mode — deep reasoning on demand, instant chat otherwise. ~80–120 tok/s on Apple Silicon Metal. 128K context. Apache 2.0.
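The hybrid-thinking toggle mentioned above follows Qwen3's documented soft-switch convention: appending /think or /no_think to the user message. Whether a given fine-tune honors the tags is model-dependent, so the sketch below is illustrative rather than a guaranteed interface.

```python
def build_prompt(user_text: str, think: bool) -> str:
    """Append a Qwen3-style soft-switch tag to toggle chain-of-thought.

    /think requests deep step-by-step reasoning; /no_think requests an
    instant answer. Support varies by model and fine-tune.
    """
    suffix = " /think" if think else " /no_think"
    return user_text + suffix

print(build_prompt("Prove that sqrt(2) is irrational.", think=True))
print(build_prompt("What's the capital of France?", think=False))
```

In practice you only pay the latency cost of reasoning on the prompts that need it, which is why hybrid models rank well on an 8GB budget.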
One of the best 8B models ever made: thinking mode plus lightning-fast generation. The new king of the 8B class.
⭐ Mac Mini M4 16GB top pick! NVIDIA's hybrid model — distilled from 9B, keeps 95% of its quality. Hybrid attention + SSM layers = ~80–120 tok/s on Apple Silicon. Blazing fast, minimal RAM. NVIDIA Open Model License.
The best small Qwen 3.5 for everyday use. Strong reasoning, coding, and chat at 9B scale, with hybrid thinking mode and 256K context. Runs on 8–16 GB RAM. Great for Mac Mini M4 Pro. Apache 2.0.
IBM enterprise-grade 8B. Trained for RAG, tool-use and structured output. Strong function calling and long-context performance (128K). Apache 2.0 with full data provenance.
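Function calling of the kind described above is usually driven through an OpenAI-compatible tools payload when running locally. The sketch below shows the general shape of such a request; the get_weather tool and the model tag are illustrative placeholders, not part of any Granite API.

```python
import json

# Minimal OpenAI-style tool definition. The get_weather name and its
# parameters are hypothetical examples for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

payload = {
    "model": "granite-8b-instruct",  # placeholder; use your runtime's actual model tag
    "messages": [{"role": "user", "content": "What's the weather in Zurich?"}],
    "tools": tools,
}
print(json.dumps(payload, indent=2))
```

A model trained for tool use responds with a structured tool call (name plus JSON arguments) instead of free text, which your application then executes and feeds back.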
Gemma 4 balanced edge model with strong multimodal quality and 256K context. Great for laptops and high-end mobile devices. Apache 2.0.
Qwen coding specialist with long context. Great for agentic coding tasks. 477K downloads.
Google on-device powerhouse with vision. Designed for phones/tablets/laptops but punches far above its weight. Per-layer memory management for constrained devices. Apache 2.0.
DeepSeek's reasoning model distilled to 8B. Shows its thought process step by step. Mind-blowing for logic.
Hybrid reasoning model that outperforms same-size peers. Strong general and reasoning performance at 8B. 558K downloads.
LG AI Research reasoning model. Strong at math and coding reasoning. 200K downloads.
Alibaba's model trained on 18T tokens. Excellent multilingual and coding performance. 14.9M downloads and wide community support.
Shanghai AI Lab multimodal model. Strong vision understanding for documents, charts, and photos. MIT licensed. Note: primarily PyTorch/safetensors — community GGUF may vary.
DeepSeek's reasoning model distilled to 7B. Shows its thought process step by step. 65.5M downloads total.
Open reasoning model distilled from DeepSeek R1. Excellent chain-of-thought. 601K downloads.
Alibaba open reasoning model. Good chain-of-thought reasoning at 7B. 52K downloads.