Local LLM model page
Kimi K2 Instruct (1T MoE)
Moonshot AI's trillion-parameter MoE flagship: 32B active parameters per token across 384 experts. Matches or beats GPT-4 Turbo on MMLU, GSM8K, and HumanEval, with a focus on agentic and tool-use workloads. Server-grade hardware only. Modified MIT license.
Parameters
1T (32B active, 384 experts)
Minimum RAM
1024 GB
Model size
600 GB
Quantization
Q4_K_M
Can Kimi K2 Instruct (1T MoE) run locally?
Kimi K2 Instruct (1T MoE) is best suited for server-grade or multi-GPU systems. LocalClaw recommends Q4_K_M as the default quantization, with at least 1024 GB RAM.
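The 600 GB model size and 1024 GB RAM floor follow from simple arithmetic. A rough back-of-envelope sketch, assuming ~4.5 bits per weight for Q4_K_M (an approximation, not a figure from the model card) plus headroom for KV cache, runtime buffers, and the OS:

```python
# Back-of-envelope memory estimate for Kimi K2 Instruct at Q4_K_M.
# Assumption (not from the model card): Q4_K_M averages ~4.5 bits/weight.

def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

total_params = 1e12    # 1T total parameters (MoE)
active_params = 32e9   # 32B active parameters per token

weights_gb = quantized_size_gb(total_params, 4.5)  # all experts must fit in memory
active_gb = quantized_size_gb(active_params, 4.5)  # weights actually read per token

print(f"Q4_K_M weights: ~{weights_gb:.0f} GB")          # ~562 GB, in line with the ~600 GB figure
print(f"Active weights per token: ~{active_gb:.0f} GB") # ~18 GB read per token
```

Note the MoE trade-off this exposes: all 1T parameters must be resident (hence the 1024 GB RAM recommendation), but only ~3% of them are read per token, so generation speed is bounded by memory bandwidth over ~18 GB rather than the full 562 GB.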
Search term for LM Studio or compatible runtimes: kimi-k2-instruct
Hugging Face repository: moonshotai/Kimi-K2-Instruct
Strengths
- Frontier-level quality: matches or beats GPT-4 Turbo on MMLU, GSM8K, and HumanEval
- Strong agentic and tool-use performance
- MoE efficiency: only 32B of 1T parameters are active per token, keeping per-token compute far below a dense model of this size
- Permissive Modified MIT license
Limitations
- Performance depends heavily on quantization, RAM bandwidth and runtime support.
Best use cases
- chat
- code
- reasoning
- quality
- general
Benchmarks
Speed: 3/10
Quality: 10/10
Coding: 10/10
Reasoning: 10/10
Technical details
Developer: Moonshot AI
License: Modified MIT
Context window: 128K tokens
Architecture: Mixture-of-Experts (384 experts, 32B active parameters per token)
Released: 2025-07