
Kimi Linear 48B-A3B Instruct

Moonshot AI's efficient Kimi model: a Mixture-of-Experts design with a linear-attention architecture and 3B active parameters out of 48B total. Strong long-context, reasoning, and coding performance. MIT licensed.

Parameters: 48B (3B active, MoE)

Minimum RAM: 48 GB

Model size: 28 GB

Quantization: Q4_K_M

Can Kimi Linear 48B-A3B Instruct run locally?

Yes, on a high-end workstation. LocalClaw recommends Q4_K_M as the default quantization, which needs at least 48 GB RAM; 64 GB is the better fit and leaves headroom for long contexts and other applications.
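
The RAM guidance can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not a measurement: the bits-per-weight value for Q4_K_M and the overhead allowance are assumptions chosen for illustration, and real files vary by runtime and tensor layout.

```python
# Rough memory estimate for a quantized model (illustrative assumptions throughout).

TOTAL_PARAMS = 48e9      # from the spec sheet: 48B total parameters
BITS_PER_WEIGHT = 4.8    # assumption: approximate average for Q4_K_M
OVERHEAD_GB = 8.0        # assumption: cache, runtime buffers, OS headroom


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Size of the weights in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


def fits_in_ram(ram_gb: float,
                params: float = TOTAL_PARAMS,
                bits_per_weight: float = BITS_PER_WEIGHT,
                overhead_gb: float = OVERHEAD_GB) -> bool:
    """True if the weights plus the assumed overhead fit in the given RAM."""
    return weights_gb(params, bits_per_weight) + overhead_gb <= ram_gb


if __name__ == "__main__":
    size = weights_gb(TOTAL_PARAMS, BITS_PER_WEIGHT)
    print(f"Estimated weights: {size:.1f} GB")
    for ram in (32, 48, 64):
        verdict = "fits" if fits_in_ram(ram) else "does not fit"
        print(f"{ram} GB RAM: {verdict}")
```

Under these assumptions the weights come to roughly 29 GB, close to the 28 GB listed above. All 48B parameters must be resident even though only 3B are active per token, which is why the RAM requirement tracks the total parameter count rather than the active one.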

Search term for LM Studio or compatible runtimes: kimi-linear-48b-a3b-instruct

Hugging Face repository: moonshotai/Kimi-Linear-48B-A3B-Instruct
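
For runtimes that load weights from disk, the repository can be fetched with the huggingface_hub Python package. This is a minimal sketch: the local directory name is hypothetical, and it assumes the repository holds the original unquantized weights, so the download is likely far larger than the 28 GB Q4_K_M figure. A quantized build, if one exists, would come from a separate repository.

```python
REPO_ID = "moonshotai/Kimi-Linear-48B-A3B-Instruct"  # from the listing above


def repo_url(repo_id: str = REPO_ID) -> str:
    """Web URL of a model repository on the Hugging Face Hub."""
    return f"https://huggingface.co/{repo_id}"


def download(repo_id: str = REPO_ID, local_dir: str = "./kimi-linear") -> str:
    """Download the full repository snapshot and return its local path.

    Requires `pip install huggingface_hub` and plenty of free disk space.
    The default local_dir is a hypothetical path chosen for this example.
    """
    from huggingface_hub import snapshot_download  # imported lazily
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


if __name__ == "__main__":
    print(repo_url())
    # download()  # uncomment to fetch the weights
```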

Tags: chat, code, reasoning, power, moe, long-context

Strengths

  • Efficient MoE design: only 3B of the 48B parameters are active per token.
  • Linear-attention architecture with strong long-context performance.
  • Strong reasoning and coding performance.
  • MIT licensed.

Limitations

  • Performance depends heavily on quantization, RAM bandwidth and runtime support.
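
The dependence on RAM bandwidth can be illustrated with a common back-of-the-envelope bound: during decoding, each generated token requires reading roughly the active weights once, so memory bandwidth divided by bytes per token caps the token rate. The sketch below assumes an approximate bits-per-weight figure for Q4_K_M and uses hypothetical bandwidth numbers; it ignores cache reads, compute limits, and runtime overhead, so real speeds will be lower.

```python
ACTIVE_PARAMS = 3e9      # from the spec sheet: 3B active parameters per token
BITS_PER_WEIGHT = 4.8    # assumption: approximate average for Q4_K_M


def max_tokens_per_second(bandwidth_gb_s: float,
                          active_params: float = ACTIVE_PARAMS,
                          bits_per_weight: float = BITS_PER_WEIGHT) -> float:
    """Upper bound on decode speed if each active weight is read once per token."""
    bytes_per_token = active_params * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


if __name__ == "__main__":
    # Hypothetical bandwidth figures, for illustration only.
    for label, bandwidth in (("desktop DDR5", 80), ("unified memory", 400)):
        bound = max_tokens_per_second(bandwidth)
        print(f"{label} ({bandwidth} GB/s): at most {bound:.0f} tokens/s")
```

The bound scales linearly with bandwidth and inversely with bits per weight, which is why both the quantization level and the memory subsystem matter more than raw core count for this model.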

Best use cases

  • chat
  • code
  • reasoning
  • power
  • moe
  • long-context

Benchmarks

Speed: 6/10

Quality: 8/10

Coding: 8/10

Reasoning: 8/10

Technical details

Developer: Moonshot AI

License: MIT (see model repository)

Context window: Not listed (see model card)

Architecture: Mixture-of-Experts with linear attention (see model card)

Released: 2025-12