Local LLM model page

Kimi K2 Instruct (1T MoE)

Moonshot AI's trillion-parameter MoE flagship. 32B active params per token routed across 384 experts. Matches or beats GPT-4 Turbo on MMLU, GSM8K, and HumanEval. Agentic and tool-use specialist. Server-grade hardware only. Modified MIT license.

Parameters
1T (32B active, 384 experts)
Minimum RAM
1024 GB
Model size
600 GB
Quantization
Q4_K_M
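The 600 GB figure follows from the quantization: Q4_K_M averages roughly 4.8 bits per weight (an assumed average; the exact per-tensor quant mix varies), so a quick sketch of the arithmetic:

```python
def quantized_size_gb(n_params: float, bits_per_weight: float = 4.8) -> float:
    """Estimate the on-disk size of a quantized model in decimal GB.

    bits_per_weight ~= 4.8 is an assumed average for Q4_K_M;
    the real figure depends on the per-tensor quantization mix.
    """
    return n_params * bits_per_weight / 8 / 1e9

# All 1T weights must be stored, even though only 32B are active per token.
print(quantized_size_gb(1e12))  # → 600.0
```

Note the MoE asymmetry: total parameters (1T) set the disk and RAM footprint, while active parameters (32B) drive per-token compute.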

Can Kimi K2 Instruct (1T MoE) run locally?

Kimi K2 Instruct (1T MoE) is best suited for server-grade or multi-GPU systems. LocalClaw recommends Q4_K_M as the default quantization, with at least 1024 GB RAM.
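A quick way to check whether a machine meets the 1024 GB recommendation, as a stdlib-only sketch (the `sysconf` names used here are POSIX/Linux-specific):

```python
import os

def total_ram_gb() -> float:
    """Physical RAM in decimal GB (Linux/POSIX; relies on sysconf names)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9

def meets_recommendation(min_gb: float = 1024.0) -> bool:
    """True if this host has at least the recommended RAM for Q4_K_M."""
    return total_ram_gb() >= min_gb

print(f"{total_ram_gb():.0f} GB detected; recommended minimum met: {meets_recommendation()}")
```

On macOS or Windows a cross-platform library such as psutil would be needed instead.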

Search term for LM Studio or compatible runtimes: kimi-k2-instruct

Hugging Face repository: moonshotai/Kimi-K2-Instruct

Tags: chat, code, reasoning, quality, general

Strengths

  • Trillion-parameter mixture-of-experts model from Moonshot AI, with 32B active parameters per token routed across 384 experts.
  • Matches or beats GPT-4 Turbo on MMLU, GSM8K, and HumanEval.
  • Specialized for agentic workflows and tool use.
  • Permissive Modified MIT license.

Limitations

  • Performance depends heavily on quantization, RAM bandwidth and runtime support.

Best use cases

  • chat
  • code
  • reasoning
  • quality
  • general

Benchmarks

Speed: 3/10

Quality: 10/10

Coding: 10/10

Reasoning: 10/10

Technical details

Developer: Moonshot AI

License: Modified MIT

Context window: Not specified

Architecture: Mixture-of-experts (see model card)

Released: 2025-07