Local LLM model page

DeepSeek V3.1 (671B MoE)

A hybrid thinking/non-thinking model. The full 671B-parameter MoE delivers maximum quality while activating only 37B parameters per token at inference. A significant step up from V3.0. Requires server-grade hardware. MIT licensed.

Parameters
671B (37B active, MoE)
Minimum RAM
512 GB
Model size
360 GB
Quantization
Q4_K_M

Can DeepSeek V3.1 (671B MoE) run locally?

DeepSeek V3.1 (671B MoE) is best suited for server-grade or multi-GPU systems. LocalClaw recommends Q4_K_M as the default quantization, with at least 512 GB RAM.
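As a rough sanity check on the numbers above: a Q4_K_M GGUF averages roughly 4.5-5 bits per weight, so the file size can be estimated from the parameter count (the bits-per-weight figure is an approximation, not an exact property of the format):

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Estimate GGUF file size in GB: total weights x bits per weight / 8."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# 671B parameters at ~4.5 bits/weight lands near the listed ~360 GB figure
print(round(quantized_size_gb(671, 4.5)))  # ~377
```

Note that RAM needs exceed file size: the KV cache and runtime overhead are why 512 GB, not 360 GB, is the practical floor.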

Search term for LM Studio or compatible runtimes: deepseek-v3.1

Hugging Face repository: unsloth/DeepSeek-V3.1-GGUF
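One possible workflow is to fetch only the Q4_K_M shards and serve them with llama.cpp's OpenAI-compatible server. The `--include` glob and the shard filename below are assumptions; check the repository's file list for the actual names:

```shell
# Download only the Q4_K_M shards (glob pattern is an assumption;
# verify against the repo's file list on Hugging Face)
huggingface-cli download unsloth/DeepSeek-V3.1-GGUF --include "*Q4_K_M*"

# Point llama-server at the first shard (placeholder filename);
# llama.cpp picks up the remaining shards of a split GGUF automatically
llama-server -m <first-Q4_K_M-shard>.gguf -c 8192
```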

Tags: chat, reasoning, quality

Strengths

  • Hybrid thinking/non-thinking mode
  • Only 37B active parameters despite 671B total
  • Top-tier quality
  • Among the strongest open-weight models released to date

Limitations

  • Requires 512 GB+ RAM for full model
  • Server-grade hardware only
  • Complex setup

Best use cases

  • Maximum quality outputs
  • Research
  • Enterprise deployment
  • Frontier AI tasks

Benchmarks

Speed: 1/10

Quality: 10/10

Coding: 10/10

Reasoning: 10/10

Technical details

Developer: DeepSeek AI

License: MIT

Context window: 131,072 tokens

Architecture: Mixture of Experts (MoE) — 671B total, ~37B active

Released: 2025-08
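The "Mixture of Experts" row above is the key to the 671B-total / ~37B-active split: a router scores all experts but runs only the top-k per token. A minimal sketch of top-k gating (toy code, not DeepSeek's actual routing implementation):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_scores, k=2):
    """Toy top-k MoE layer: only k experts run per token, so most
    parameters stay idle -- the source of the 37B-active figure."""
    topk = sorted(range(len(gate_scores)),
                  key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in topk])
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

# Eight toy "experts", each just scaling its input
experts = [lambda x, s=s: s * x for s in range(1, 9)]
print(moe_forward(1.0, experts, [0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 5.0]))
```

With k=2 of 8 experts selected, only a quarter of the toy model's parameters touch each input; DeepSeek's ratio (37B of 671B, about 5.5%) is sparser still, which is what keeps per-token compute manageable despite the huge total size.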