Local LLM model page
Kimi K2.5 (32B/1T MoE)
Moonshot AI's agentic flagship. 1T total MoE parameters with 32B active per forward pass. Strong long-context reasoning across a 256K-token window. Designed for complex agentic tasks and tool use. Model License — check moonshotai.com for commercial terms.
Parameters
32B active (1T total MoE)
Minimum RAM
32 GB
Model size
22 GB
Quantization
Q4_K_M
Can Kimi K2.5 (32B/1T MoE) run locally?
Kimi K2.5 (32B/1T MoE) is best suited to power-user machines. LocalClaw recommends Q4_K_M as the default quantization, which requires at least 32 GB of RAM.
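The RAM recommendation above can be sanity-checked with back-of-envelope arithmetic. This sketch uses the 22 GB Q4_K_M file size from this page; the runtime-overhead and OS-reserve figures are assumptions, not measurements.

```python
# Back-of-envelope check: can the Q4_K_M GGUF run fully in memory?
WEIGHTS_GB = 22.0        # Q4_K_M model size (from this page)
OVERHEAD_GB = 3.0        # assumed: runtime buffers + KV cache at short context
SYSTEM_RESERVE_GB = 4.0  # assumed: headroom for the OS and other apps

def fits(ram_gb):
    """True if usable RAM covers weights plus estimated overhead."""
    return ram_gb - SYSTEM_RESERVE_GB >= WEIGHTS_GB + OVERHEAD_GB

print(fits(32))  # True: 28 GB usable vs ~25 GB needed
print(fits(16))  # False: a 16 GB machine cannot hold the weights
```

Under these assumptions a 32 GB machine fits with a few gigabytes to spare, which is why 32 GB is listed as the minimum rather than a comfortable target.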
Search term for LM Studio or compatible runtimes: kimi-k2.5-32b
Hugging Face repository: moonshotai/Kimi-K2.5-32B-GGUF
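When fetching from the repository programmatically, you typically list the repo's files and pick the GGUF matching the recommended quantization. A minimal sketch — the filenames in the listing below are hypothetical; check the actual file list at moonshotai/Kimi-K2.5-32B-GGUF before downloading:

```python
# Sketch: select the GGUF file for a given quant from a repo file listing.
# The chosen filename would then be passed to a downloader such as
# huggingface_hub's hf_hub_download(repo_id=REPO_ID, filename=...).
REPO_ID = "moonshotai/Kimi-K2.5-32B-GGUF"

def pick_gguf(files, quant="Q4_K_M"):
    """Return the first .gguf file whose name contains the quant tag."""
    matches = [f for f in files
               if f.endswith(".gguf") and quant.lower() in f.lower()]
    if not matches:
        raise FileNotFoundError(f"no {quant} GGUF in listing")
    return matches[0]

# Hypothetical listing for illustration only:
listing = [
    "README.md",
    "kimi-k2.5-32b-Q8_0.gguf",
    "kimi-k2.5-32b-Q4_K_M.gguf",
]
print(pick_gguf(listing))
```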
Tags: chat, code, reasoning, power, quality
Strengths
- Massive 1T MoE with 32B active
- 256K context
- Strong multi-step reasoning
- Top-tier quality
Limitations
- Needs 32 GB+ RAM
- Model license restrictions
- Complex setup
Best use cases
- Long document processing
- Complex reasoning
- Enterprise AI
- Research
Benchmarks
Speed: 4/10
Quality: 10/10
Coding: 10/10
Reasoning: 10/10
Technical details
Developer: Moonshot AI
License: Model License
Context window: 262,144 tokens
Architecture: Mixture of Experts — 1T total, 32B active
Released: 2026-01
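The 262,144-token window is the headline feature, but KV-cache memory grows linearly with context length, which is part of why long-context runs need extra RAM beyond the weights. A rough sizing sketch — the layer count, KV-head count, and head dimension below are hypothetical placeholders, not published specs for this model:

```python
# Rough KV-cache size at the full 262,144-token context window.
N_LAYERS = 60      # assumed, not a published spec
N_KV_HEADS = 8     # assumed (grouped-query attention)
HEAD_DIM = 128     # assumed
BYTES = 2          # fp16 per K/V entry
CONTEXT = 262_144  # context window from this page

# Factor of 2 accounts for storing both K and V per layer.
bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES
kv_gib = bytes_per_token * CONTEXT / 2**30
print(f"{kv_gib:.0f} GiB")  # 60 GiB under these assumed dimensions
```

Even with these modest assumed dimensions, a full-window KV cache dwarfs the 22 GB of weights; in practice long-context sessions rely on smaller effective contexts, KV-cache quantization, or offloading.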