Local LLM model page

Kimi Linear 48B-A3B Instruct

An efficient Kimi model from Moonshot AI with a linear-attention-style architecture and 3B active parameters. Strong long-context, reasoning, and coding signal. MIT licensed.


Parameters

48B (3B active, MoE)

Minimum RAM

48 GB

Model size

28 GB

Quantization

Q4_K_M

Can Kimi Linear 48B-A3B Instruct run locally?

Kimi Linear 48B-A3B Instruct is best suited to high-end workstations with 64 GB of RAM; 48 GB is the minimum. LocalClaw recommends Q4_K_M as the default quantization.
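As a rough check on the figures above, a quantized model's file size can be estimated from its total parameter count and the average bits per weight. The sketch below is an illustration, not LocalClaw's method: the 4.7 bits-per-weight average for Q4_K_M is an assumed approximation (the real average varies by model), and both function names are hypothetical.

```python
def estimate_model_size_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate size in GB (10^9 bytes) of a quantized model.

    total_params_b  -- total parameters in billions (48 for this model)
    bits_per_weight -- assumed average bits per weight for the quantization
    """
    return total_params_b * bits_per_weight / 8


def ram_tier(system_ram_gb: float, minimum_gb: float = 48, recommended_gb: float = 64) -> str:
    """Classify a machine against the page's 48 GB minimum and 64 GB recommendation."""
    if system_ram_gb >= recommended_gb:
        return "recommended"
    if system_ram_gb >= minimum_gb:
        return "minimum"
    return "insufficient"


# 4.7 bits per weight is an assumption for Q4_K_M, not a published figure.
print(f"{estimate_model_size_gb(48, 4.7):.1f} GB")  # prints "28.2 GB"
print(ram_tier(64))  # prints "recommended"
```

The estimate uses the total parameter count, not the 3B active count: an MoE model routes each token through a small subset of experts, but all expert weights still have to be resident in memory. The gap between the 28 GB model size and the 48 GB minimum leaves room for the operating system, runtime, and context state.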

Search term for LM Studio or compatible runtimes: kimi-linear-48b-a3b-instruct

Hugging Face repository: moonshotai/Kimi-Linear-48B-A3B-Instruct

Tags: chat, code, reasoning, power, moe, long-context

Strengths

Efficient MoE design with a linear-attention-style architecture: only 3B of the 48B parameters are active per token. Strong long-context, reasoning, and coding signal. MIT licensed.

Limitations

Performance depends heavily on quantization, RAM bandwidth and runtime support.
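To see why RAM bandwidth matters, a common back-of-the-envelope bound treats token generation as memory-bound: each generated token has to read roughly the active weights once, so tokens per second cannot exceed memory bandwidth divided by the bytes read per token. The sketch below applies that bound to the 3B active parameters. The 4.7 bits-per-weight figure and the 100 GB/s bandwidth are assumed example values, and the function name is hypothetical; real throughput will be lower because of attention-state reads, compute, and runtime overhead.

```python
def decode_tokens_per_sec_upper_bound(
    active_params_b: float,
    bits_per_weight: float,
    bandwidth_gb_s: float,
) -> float:
    """Upper bound on decode speed if generation is purely memory-bound.

    active_params_b -- parameters used per token, in billions (3 for this model)
    bits_per_weight -- assumed average bits per weight for the quantization
    bandwidth_gb_s  -- sustained memory bandwidth in GB/s (assumed, hardware-specific)
    """
    gb_read_per_token = active_params_b * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


# Example values only: 4.7 bits/weight and 100 GB/s are assumptions.
bound = decode_tokens_per_sec_upper_bound(3, 4.7, 100)
print(f"at most ~{bound:.0f} tokens/s")  # prints "at most ~57 tokens/s"
```

Under this model the bound scales linearly with bandwidth and inversely with bits per weight, which is why a smaller quantization or faster memory both raise the ceiling. It says nothing about whether a given runtime supports the architecture at all.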

Best use cases

chat
code
reasoning
power
moe
long-context

Benchmarks

Speed: 6/10

Quality: 8/10

Coding: 8/10

Reasoning: 8/10

Technical details

Developer: Moonshot AI

License: MIT (see model repository)

Context window: Unknown

Architecture: Mixture-of-experts (MoE) with linear-attention-style architecture (see model card)

Released: 2025-12