Local LLM model page

Kimi K2.5 (32B/1T MoE)

Moonshot AI's agentic flagship. 1T total MoE parameters with 32B active per forward pass. Strong long-context reasoning across a 256K-token window. Designed for complex agentic tasks and tool use. Model License — check moonshotai.com for commercial terms.

Parameters
32B active (1T total MoE)
Minimum RAM
32 GB
Model size
22 GB
Quantization
Q4_K_M
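The 22 GB figure above is in line with a back-of-the-envelope estimate: Q4_K_M averages roughly 4.5 bits per weight, so a lower bound for a 32B-parameter GGUF can be sketched as follows (the 4.5 bits/weight average and the helper name are illustrative assumptions, not LocalClaw or llama.cpp values):

```python
def gguf_size_gb(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Rough lower bound on GGUF file size in decimal gigabytes.

    bits_per_weight ~= 4.5 is a commonly cited average for Q4_K_M;
    real files run larger because some tensors (e.g. embeddings and
    the output head) are kept at higher precision.
    """
    return n_params * bits_per_weight / 8 / 1e9

# 32B parameters at ~4.5 bits/weight -> ~18 GB, the same
# ballpark as the 22 GB listed in the table above.
print(round(gguf_size_gb(32e9), 1))
```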

Can Kimi K2.5 (32B/1T MoE) run locally?

Kimi K2.5 (32B/1T MoE) is best suited for power-user machines. LocalClaw recommends the Q4_K_M quantization as the default, with at least 32 GB of RAM.
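The 32 GB recommendation can be sanity-checked with a crude budget: weights plus a modest KV cache plus OS headroom. The 4 GB allowances below are assumed rule-of-thumb values, not LocalClaw's actual formula:

```python
def meets_min_ram(ram_gb: float, model_gb: float,
                  kv_cache_gb: float = 4.0,
                  os_headroom_gb: float = 4.0) -> bool:
    """Crude fit check: room for model weights, a modest KV cache,
    and the operating system. Rule-of-thumb allowances only."""
    return ram_gb >= model_gb + kv_cache_gb + os_headroom_gb

print(meets_min_ram(32, 22))  # 22 GB Q4_K_M on a 32 GB machine -> True (barely)
print(meets_min_ram(16, 22))  # -> False
```

The margin on a 32 GB machine is thin, which is why lower quantizations or shorter contexts are common fallbacks.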

Search term for LM Studio or compatible runtimes: kimi-k2.5-32b

Hugging Face repository: moonshotai/Kimi-K2.5-32B-GGUF

Tags: chat, code, reasoning, power, quality

Strengths

  • Massive 1T MoE with 32B active
  • 256K context
  • Strong long-context reasoning
  • Top-tier quality

Limitations

  • Needs 32 GB+ RAM
  • Model license restrictions
  • Complex setup

Best use cases

  • Long document processing
  • Complex reasoning
  • Enterprise AI
  • Research

Benchmarks

Speed: 4/10

Quality: 10/10

Coding: 10/10

Reasoning: 10/10

Technical details

Developer: Moonshot AI

License: Model License

Context window: 262,144 tokens
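A full 262,144-token context is memory-hungry in its own right. As a hedged illustration with assumed architecture numbers (60 layers, 8 grouped-query KV heads, head dimension 128, fp16 cache — none of these are published Kimi K2.5 figures), the KV cache alone would be:

```python
def kv_cache_gib(ctx: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, bytes_per_elem: int = 2) -> float:
    """KV cache size in GiB: 2 tensors (K and V) per layer per KV head
    per position, times element width (2 bytes for fp16)."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 2**30

# Assumed shapes, NOT published Kimi K2.5 values:
print(kv_cache_gib(262_144, n_layers=60, n_kv_heads=8, head_dim=128))
```

Under these assumptions a full 256K cache alone is about 60 GiB, which is why 32 GB machines typically run far shorter contexts or a quantized (q8/q4) KV cache in practice.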

Architecture: Mixture of Experts — 1T total, 32B active

Released: 2026-01