DeepSeek V3.1 (671B MoE)
Hybrid thinking/non-thinking model. The full 671B-parameter MoE for maximum quality, with only ~37B parameters active per token at inference. A significant step up from V3. Requires server-grade hardware. MIT licensed.
Parameters
671B (37B active, MoE)
Minimum RAM
512 GB
Model size
360 GB
Quantization
Q4_K_M
Can DeepSeek V3.1 (671B MoE) run locally?
DeepSeek V3.1 (671B MoE) is best suited for server-grade or multi-GPU systems. LocalClaw recommends Q4_K_M as the default quantization, with at least 512 GB RAM.
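As a rough sanity check (a sketch, not LocalClaw's actual sizing method), the on-disk size of a quantized model can be estimated from the parameter count and the quantizer's average bits per weight. The ~4.3 bits/weight figure used for Q4_K_M below is an approximation, not an exact spec:

```python
def quantized_size_gb(total_params: float, bits_per_weight: float) -> float:
    """Estimate on-disk model size in GB (1 GB = 1e9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9

# 671B parameters at ~4.3 bits/weight (approximate average for Q4_K_M)
size = quantized_size_gb(671e9, 4.3)
print(f"~{size:.0f} GB")  # lands close to the 360 GB listed above
```

The gap between the ~360 GB file and the 512 GB RAM recommendation leaves headroom for the KV cache, runtime buffers, and the operating system.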
Search term for LM Studio or compatible runtimes: deepseek-v3.1
Hugging Face repository: unsloth/DeepSeek-V3.1-GGUF
Tags: chat, reasoning, quality
Strengths
- Hybrid thinking/non-thinking mode
- Only 37B active parameters despite 671B total
- Top-tier quality
- Among the strongest open-weight models available
Limitations
- Requires 512 GB+ RAM for the full model
- Server-grade hardware only
- Complex setup
Best use cases
- Maximum quality outputs
- Research
- Enterprise deployment
- Frontier AI tasks
Benchmarks
Speed: 1/10
Quality: 10/10
Coding: 10/10
Reasoning: 10/10
Technical details
Developer: DeepSeek AI
License: MIT
Context window: 131,072 tokens
Architecture: Mixture of Experts (MoE) — 671B total, ~37B active
Released: 2025-08
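To make the "671B total, ~37B active" line concrete: in an MoE layer, a gate scores all experts per token and only the top-k experts actually run, so per-token compute scales with the active subset rather than the full parameter count. The routing sketch below is a minimal, hypothetical illustration of top-k gating, not DeepSeek's actual router:

```python
import math

def top_k_route(gate_logits, k=2):
    """Toy MoE routing: pick the top-k experts for a token and
    renormalize their gate weights so they sum to 1.

    Only the chosen experts execute, which is why a model with a
    huge total parameter count activates only a fraction per token.
    (Illustrative sketch only, not DeepSeek's implementation.)
    """
    # Softmax over the expert logits (shifted for numerical stability).
    m = max(gate_logits)
    exps = [math.exp(x - m) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the k highest-probability experts and renormalize.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

weights = top_k_route([1.2, -0.3, 2.5, 0.1], k=2)
print(weights)  # only 2 of the 4 toy experts receive nonzero weight
```

In a real MoE layer the same idea is applied per layer and per token, with the selected experts' outputs combined using these renormalized weights.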