Kimi K2 Instruct (1T MoE)
Moonshot AI trillion-parameter MoE flagship. 32B active params per token with 384 experts. Matches or beats GPT-4 Turbo on MMLU, GSM8K, HumanEval. Agentic & tool-use specialist. Server-grade only. Modified MIT.
Best local AI models for coding, repo work, debugging and software engineering. Compare RAM needs, quality, coding scores and LM Studio setup. Rankings come from the LocalClaw model database, with RAM requirements, quantization options and links to static model pages.
For coding, start with Kimi K2 Instruct (1T MoE) if your hardware fits it. If not, choose the highest-ranked model that fits your RAM tier and preferred quantization.
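As a quick sanity check before downloading, you can estimate whether a model fits your RAM from its total parameter count and quantization. The sketch below is a back-of-the-envelope Python estimate, assuming a simple bits-per-weight calculation and an arbitrary 20% runtime overhead; the figures are illustrative assumptions, not LocalClaw's sizing data. Note that MoE models load every expert into memory, so the total parameter count drives RAM use, not the active count.

```python
# Back-of-the-envelope RAM estimate for a quantized local model.
# The overhead factor and bits-per-weight figures are illustrative
# assumptions, not LocalClaw's actual sizing data.

def estimated_ram_gb(total_params_b: float, bits_per_weight: float,
                     overhead: float = 1.2) -> float:
    """Approximate resident size in GB for a fully loaded model.

    total_params_b  -- total parameters in billions (MoE loads every expert)
    bits_per_weight -- e.g. ~4.5 for Q4_K_M, 8 for Q8_0, 16 for FP16
    overhead        -- fudge factor for KV cache, buffers and runtime
    """
    weight_gb = total_params_b * bits_per_weight / 8  # billions of params * bytes per param
    return weight_gb * overhead


def fits_in_ram(total_params_b: float, bits_per_weight: float, ram_gb: float) -> bool:
    """True if the estimated footprint fits the given RAM budget."""
    return estimated_ram_gb(total_params_b, bits_per_weight) <= ram_gb


if __name__ == "__main__":
    # Example: a 671B-total-parameter MoE at ~4.5 bits/weight on a 512 GB machine.
    need = estimated_ram_gb(671, 4.5)
    print(f"~{need:.0f} GB needed; fits in 512 GB: {fits_in_ram(671, 4.5, 512)}")
```

If the estimate does not fit, drop to a lower-bit quantization or a smaller model in the same family rather than swapping to disk.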
Flagship open-source Qwen 3.5. Only 17B active params despite 397B total — world-class quality at MoE efficiency. Matches GPT-4o on major benchmarks. Requires multi-GPU or server-grade hardware. Apache 2.0.
Moonshot AI K2 with extended reasoning mode. Chain-of-thought traces before final answer. Top-5 on GPQA, AIME, SWE-bench. Requires datacenter-grade hardware or distributed inference. Modified MIT.
DeepSeek frontier MoE with 1M-token context, hybrid compressed attention and top-tier coding/reasoning. MIT licensed. Datacenter-grade only.
Z.ai next-generation flagship for agentic engineering. Stronger coding, long-horizon tool use and repo generation, with gains on SWE-Bench Pro and Terminal-Bench. MIT licensed.
Experimental V3.2 with DeepSeek Sparse Attention (DSA) — halves inference cost vs V3.1 on long context while keeping quality. 128K context, improved coding & tool-use. MIT licensed. Server-grade.
Zhipu AI flagship — full GLM 4.6. 200K context, strong tool-calling & agentic workflows. Competes with Claude 3.5 Sonnet on reasoning and code. MIT licensed. Server-grade hardware.
Updated flagship DeepSeek R1 with improved reasoning chains and fewer hallucinations. Major upgrade to chain-of-thought quality. MIT licensed. Server-grade only.
Near GPT-4 intelligence locally. Thinking mode demolishes hard problems. The local AI dream.
Moonshot AI's agentic flagship. 1T total MoE parameters with 32B active per forward pass. Unmatched long-context reasoning at 256K tokens. Designed for complex agentic tasks and tool use. Model License — check moonshotai.com for commercial terms.
Qwen flagship coding model. Designed for agentic coding with 256K context. Outperforms Claude 3.5 Sonnet on SWE-bench. Apache 2.0.
MiniMax MoE flagship with 10B active params and 4M-token long-context. Specialised for agentic coding and tool-use. Competitive with GPT-4 class models at a fraction of the inference cost. MIT licensed.
DeepSeek's massive MoE flagship. 37B active out of 671B total. Exceptional coding, reasoning and general capabilities. Ranks #6 on global usage leaderboards with 29B monthly tokens. MIT licensed.
Arcee AI's massive MoE open model. ~400B total parameters, 70B active per forward pass. Ranks near the top of global usage leaderboards. Exceptional versatility across reasoning, coding and chat. Free and open-source. Apache 2.0.
Mixture of Experts behemoth. Only 22B params active at once, so it stays fast despite its massive size. Top-tier.
Large MoE model with only 10B active params. 60% cheaper to run than Qwen3-Max. 256K context. Top-tier reasoning, coding and multilingual. Hybrid think/non-think. Apache 2.0.
671B MoE with 37B active params. The original massive DeepSeek. 2.4M downloads. Server-grade only.
Qwen 3.6 flagship dense model. Hybrid thinking mode with /think toggle for deep chain-of-thought reasoning. 128K context, 29+ languages. Significantly outperforms Qwen3.5-27B on reasoning, coding & math. Apache 2.0.
LocalClaw ranks models using their tags plus relative benchmark scores for speed, quality, coding and reasoning. The goal is a practical local setup recommendation, not a synthetic leaderboard.
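For illustration, here is a minimal Python sketch of how tag-plus-benchmark ranking could work. The weights, tag bonuses, field names and scores are hypothetical assumptions chosen for the example, not LocalClaw's actual scoring formula or data.

```python
# Minimal sketch of tag-plus-benchmark ranking. Weights, tag bonuses and
# scores below are hypothetical examples, not LocalClaw's real formula or data.
from dataclasses import dataclass, field


@dataclass
class Model:
    name: str
    scores: dict                       # relative scores in [0, 1]: speed, quality, coding, reasoning
    tags: set = field(default_factory=set)


WEIGHTS = {"coding": 0.40, "reasoning": 0.25, "quality": 0.25, "speed": 0.10}  # hypothetical
TAG_BONUS = {"agentic": 0.05, "tool-use": 0.05}                                # hypothetical


def rank_score(m: Model) -> float:
    """Weighted benchmark score plus small bonuses for relevant tags."""
    base = sum(w * m.scores.get(k, 0.0) for k, w in WEIGHTS.items())
    return base + sum(b for t, b in TAG_BONUS.items() if t in m.tags)


models = [
    Model("kimi-k2-instruct",
          {"coding": 0.97, "reasoning": 0.95, "quality": 0.96, "speed": 0.40},
          {"agentic", "tool-use"}),
    Model("qwen3-coder",
          {"coding": 0.95, "reasoning": 0.90, "quality": 0.92, "speed": 0.60},
          {"agentic"}),
]

for m in sorted(models, key=rank_score, reverse=True):
    print(f"{m.name}: {rank_score(m):.3f}")
```

Sorting by this combined score gives the kind of practical ordering used above, where a small tag bonus can break ties between models with similar benchmark profiles.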