Local LLM model page

Kimi K2.5 (32B/1T MoE)

Moonshot AI's agentic flagship. 1T total MoE parameters with 32B active per forward pass. Strong long-context reasoning across a 256K-token window. Designed for complex agentic tasks and tool use. Model License — check moonshotai.com for commercial terms.

Parameters
32B active (1T total MoE)
Minimum RAM
32 GB
Model size
22 GB
Quantization
Q4_K_M
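The 22 GB figure above is in line with a back-of-the-envelope estimate: Q4_K_M averages roughly 4.5 bits per weight, so a lower bound for a 32B-parameter GGUF can be sketched as follows (the 4.5 bits/weight average and the helper name are illustrative assumptions, not LocalClaw or llama.cpp values):

```python
def gguf_size_gb(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Rough lower bound on GGUF file size in decimal gigabytes.

    bits_per_weight ~= 4.5 is a commonly cited average for Q4_K_M;
    real files run larger because some tensors (e.g. embeddings and
    the output head) are kept at higher precision.
    """
    return n_params * bits_per_weight / 8 / 1e9

# 32B parameters at ~4.5 bits/weight -> ~18 GB, the same
# ballpark as the 22 GB listed in the table above.
print(round(gguf_size_gb(32e9), 1))
```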

Can Kimi K2.5 (32B/1T MoE) run locally?

Kimi K2.5 (32B/1T MoE) is best suited for power-user machines. LocalClaw recommends the Q4_K_M quantization as the default, with at least 32 GB of RAM.
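The 32 GB recommendation can be sanity-checked with a crude budget: weights plus a modest KV cache plus OS headroom. The 4 GB allowances below are assumed rule-of-thumb values, not LocalClaw's actual formula:

```python
def meets_min_ram(ram_gb: float, model_gb: float,
                  kv_cache_gb: float = 4.0,
                  os_headroom_gb: float = 4.0) -> bool:
    """Crude fit check: room for model weights, a modest KV cache,
    and the operating system. Rule-of-thumb allowances only."""
    return ram_gb >= model_gb + kv_cache_gb + os_headroom_gb

print(meets_min_ram(32, 22))  # 22 GB Q4_K_M on a 32 GB machine -> True (barely)
print(meets_min_ram(16, 22))  # -> False
```

The margin on a 32 GB machine is thin, which is why lower quantizations or shorter contexts are common fallbacks.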

Search term for LM Studio or compatible runtimes: kimi-k2.5-32b

Hugging Face repository: moonshotai/Kimi-K2.5-32B-GGUF

Tags: chat, code, reasoning, power, quality

Strengths

  • Massive 1T MoE with 32B active
  • 256K context
  • Strong long-context reasoning
  • Top-tier quality

Limitations

  • Needs 32 GB+ RAM
  • Model license restrictions
  • Complex setup

Best use cases

  • Long document processing
  • Complex reasoning
  • Enterprise AI
  • Research

Benchmarks

Speed: 4/10

Quality: 10/10

Coding: 10/10

Reasoning: 10/10

Technical details

Developer: Moonshot AI

License: Model License

Context window: 262,144 tokens
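A full 262,144-token context is memory-hungry in its own right. As a hedged illustration with assumed architecture numbers (60 layers, 8 grouped-query KV heads, head dimension 128, fp16 cache — none of these are published Kimi K2.5 figures), the KV cache alone would be:

```python
def kv_cache_gib(ctx: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, bytes_per_elem: int = 2) -> float:
    """KV cache size in GiB: 2 tensors (K and V) per layer per KV head
    per position, times element width (2 bytes for fp16)."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 2**30

# Assumed shapes, NOT published Kimi K2.5 values:
print(kv_cache_gib(262_144, n_layers=60, n_kv_heads=8, head_dim=128))
```

Under these assumptions a full 256K cache alone is about 60 GiB, which is why 32 GB machines typically run far shorter contexts or a quantized (q8/q4) KV cache in practice.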

Architecture: Mixture of Experts — 1T total, 32B active

Released: 2026-01