Open-weight local LLM

Kimi K2.7 Code (1T MoE)

Moonshot AI's coding-focused, agentic Kimi model, built on K2.6. It is a 1T-parameter MoE with 32B active parameters, a 256K context window and a MoonViT vision encoder. Moonshot reports stronger long-horizon coding with roughly 30% less thinking-token usage than K2.6. License: modified MIT. Server-grade only.

Server-grade · 1024 GB RAM · BF16 / compressed-tensors · Agentic coding research

Parameters: 1T (32B active)
Minimum RAM: 1024 GB
Model size: 595 GB
Quantization: BF16 / compressed-tensors

Can Kimi K2.7 Code (1T MoE) run locally?

Kimi K2.7 Code (1T MoE) is a server-grade model and not a practical fit for consumer hardware. Treat it as a reference point for comparison unless you have very large unified memory, multiple GPUs or access to remote inference.

Search for kimi-k2.7-code in LM Studio or another GGUF-compatible runtime.

code · reasoning · agentic · multimodal · quality

Install path

01 Check RAM fit: Minimum 1024 GB RAM. Start with the BF16 / compressed-tensors quant.
02 Load the model: Search kimi-k2.7-code in LM Studio.
03 Control locally: Use LocalClaw to manage models, agents, chat, channels and scheduled OpenClaw work.
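
Step 01 can be checked from a script before downloading anything. The following is a minimal sketch and not part of LocalClaw: the function names are hypothetical, the 1024 GB minimum and 595 GB model size are taken from this page, and RAM detection relies on `os.sysconf`, which is available on Linux and most Unix systems but not on Windows.

```python
import os

MIN_RAM_GB = 1024      # minimum RAM listed on this page
MODEL_SIZE_GB = 595    # model size listed on this page


def detect_total_ram_gb():
    """Return RAM visible to the OS in GiB, or None if it cannot be detected."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing (e.g. Windows) or the names are unsupported.
        return None
    if page_size <= 0 or page_count <= 0:
        return None
    return page_size * page_count / 1024**3


def check_ram_fit(total_ram_gb, min_ram_gb=MIN_RAM_GB, model_size_gb=MODEL_SIZE_GB):
    """Compare a RAM figure against the listed minimum and the model size."""
    return {
        "meets_minimum": total_ram_gb >= min_ram_gb,
        # Memory left for the OS, runtime and caches once weights are loaded.
        "headroom_gb": total_ram_gb - model_size_gb,
    }


if __name__ == "__main__":
    ram = detect_total_ram_gb()
    if ram is None:
        print("Could not detect RAM on this platform.")
    else:
        result = check_ram_fit(ram)
        print(f"Detected {ram:.0f} GB RAM, "
              f"meets minimum: {result['meets_minimum']}, "
              f"headroom after weights: {result['headroom_gb']:.0f} GB")
```

The OS usually reports slightly less memory than the nominal installed amount, so a machine sold as 1024 GB may fall just under the threshold here. Treat a near miss as a prompt to check manually, not as a hard failure.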

Strengths

  • Coding-focused Kimi K2.7 release
  • 1T total parameters with 32B active
  • 256K context window
  • Improved long-horizon coding and agentic benchmarks
  • About 30% less thinking-token usage than Kimi K2.6 according to Moonshot

Limitations

  • Server-grade only, not a consumer local model
  • Requires custom Transformers code and very large storage
  • License is modified MIT, not plain Apache/MIT
  • LM Studio/GGUF support may lag the official Transformers release

Best use cases

  • Agentic coding research
  • Large repository refactors
  • Long-horizon software engineering benchmarks
  • Multimodal code/document workflows

Capability profile

speed: 2
quality: 10
coding: 10
reasoning: 10

Technical notes

Developer: Moonshot AI
License: Modified MIT
Context window: 262,144 tokens
Architecture: Mixture of Experts (1T total, 32B active, 384 experts), MLA attention, MoonViT vision encoder
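
The figures on this page can be cross-checked with a little arithmetic. The sketch below assumes that "1T" means 10^12 parameters, "32B" means 32 × 10^9 parameters and "GB" means 10^9 bytes. Those readings are assumptions made for illustration, not statements from Moonshot AI.

```python
TOTAL_PARAMS = 1_000_000_000_000   # "1T", assumed to mean 10**12
ACTIVE_PARAMS = 32_000_000_000     # "32B", assumed to mean 32 * 10**9
MODEL_SIZE_BYTES = 595 * 10**9     # "595 GB", assumed decimal gigabytes
CONTEXT_TOKENS = 262_144           # context window listed on this page

# Share of parameters used for each token in the MoE forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Average storage cost per parameter implied by the listed model size.
bits_per_param = MODEL_SIZE_BYTES * 8 / TOTAL_PARAMS

# Express the context window in units of 1024 tokens.
context_k = CONTEXT_TOKENS // 1024

print(f"Active per token: {active_fraction:.1%}")
print(f"Average storage: {bits_per_param:.2f} bits per parameter")
print(f"Context: {context_k}K tokens")
```

Under these assumptions about 3.2% of the parameters are active per token, and 262,144 tokens is the same figure as the 256K context quoted elsewhere on this page. The listed size works out to roughly 4.76 bits per parameter, well below the 16 bits of pure BF16, which suggests that most of the weights ship in a compressed format.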

This model fits these next steps

Hardware fit is based on LocalClaw's RAM tier, model size and quantization metadata. Always leave memory headroom for your OS and runtime.
