Local LLM model page

Kimi K2 Instruct (1T MoE)

Moonshot AI's trillion-parameter MoE flagship. 32B active params per token routed across 384 experts. Matches or beats GPT-4 Turbo on MMLU, GSM8K, and HumanEval. Agentic and tool-use specialist. Server-grade hardware only. Modified MIT license.

Parameters
1T (32B active, 384 experts)
Minimum RAM
1024 GB
Model size
600 GB
Quantization
Q4_K_M
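The 600 GB figure follows from the quantization: Q4_K_M averages roughly 4.8 bits per weight (an assumed average; the exact per-tensor quant mix varies), so a quick sketch of the arithmetic:

```python
def quantized_size_gb(n_params: float, bits_per_weight: float = 4.8) -> float:
    """Estimate the on-disk size of a quantized model in decimal GB.

    bits_per_weight ~= 4.8 is an assumed average for Q4_K_M;
    the real figure depends on the per-tensor quantization mix.
    """
    return n_params * bits_per_weight / 8 / 1e9

# All 1T weights must be stored, even though only 32B are active per token.
print(quantized_size_gb(1e12))  # → 600.0
```

Note the MoE asymmetry: total parameters (1T) set the disk and RAM footprint, while active parameters (32B) drive per-token compute.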

Can Kimi K2 Instruct (1T MoE) run locally?

Kimi K2 Instruct (1T MoE) is best suited for server-grade or multi-GPU systems. LocalClaw recommends Q4_K_M as the default quantization, with at least 1024 GB RAM.
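A quick way to check whether a machine meets the 1024 GB recommendation, as a stdlib-only sketch (the `sysconf` names used here are POSIX/Linux-specific):

```python
import os

def total_ram_gb() -> float:
    """Physical RAM in decimal GB (Linux/POSIX; relies on sysconf names)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9

def meets_recommendation(min_gb: float = 1024.0) -> bool:
    """True if this host has at least the recommended RAM for Q4_K_M."""
    return total_ram_gb() >= min_gb

print(f"{total_ram_gb():.0f} GB detected; recommended minimum met: {meets_recommendation()}")
```

On macOS or Windows a cross-platform library such as psutil would be needed instead.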

Search term for LM Studio or compatible runtimes: kimi-k2-instruct

Hugging Face repository: moonshotai/Kimi-K2-Instruct

Tags: chat, code, reasoning, quality, general

Strengths

  • Trillion-parameter mixture-of-experts model from Moonshot AI, with 32B active parameters per token routed across 384 experts.
  • Matches or beats GPT-4 Turbo on MMLU, GSM8K, and HumanEval.
  • Specialized for agentic workflows and tool use.
  • Permissive Modified MIT license.

Limitations

  • Performance depends heavily on quantization, RAM bandwidth and runtime support.

Best use cases

  • chat
  • code
  • reasoning
  • quality
  • general

Benchmarks

Speed: 3/10

Quality: 10/10

Coding: 10/10

Reasoning: 10/10

Technical details

Developer: Moonshot AI

License: Modified MIT

Context window: Not specified

Architecture: Mixture-of-experts (see model card)

Released: 2025-07