Local LLM model page

DeepSeek V3.1 (671B MoE)

A hybrid thinking/non-thinking model. The full 671B-parameter MoE delivers maximum quality while activating only 37B parameters per token at inference. A significant step up from V3.0. Requires server-grade hardware. MIT licensed.

Parameters
671B (37B active, MoE)
Minimum RAM
512 GB
Model size
360 GB
Quantization
Q4_K_M

Can DeepSeek V3.1 (671B MoE) run locally?

DeepSeek V3.1 (671B MoE) is best suited for server-grade or multi-GPU systems. LocalClaw recommends Q4_K_M as the default quantization, with at least 512 GB RAM.
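As a rough sanity check on the numbers above: a Q4_K_M GGUF averages roughly 4.5-5 bits per weight, so the file size can be estimated from the parameter count (the bits-per-weight figure is an approximation, not an exact property of the format):

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Estimate GGUF file size in GB: total weights x bits per weight / 8."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# 671B parameters at ~4.5 bits/weight lands near the listed ~360 GB figure
print(round(quantized_size_gb(671, 4.5)))  # ~377
```

Note that RAM needs exceed file size: the KV cache and runtime overhead are why 512 GB, not 360 GB, is the practical floor.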

Search term for LM Studio or compatible runtimes: deepseek-v3.1

Hugging Face repository: unsloth/DeepSeek-V3.1-GGUF
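One possible workflow is to fetch only the Q4_K_M shards and serve them with llama.cpp's OpenAI-compatible server. The `--include` glob and the shard filename below are assumptions; check the repository's file list for the actual names:

```shell
# Download only the Q4_K_M shards (glob pattern is an assumption;
# verify against the repo's file list on Hugging Face)
huggingface-cli download unsloth/DeepSeek-V3.1-GGUF --include "*Q4_K_M*"

# Point llama-server at the first shard (placeholder filename);
# llama.cpp picks up the remaining shards of a split GGUF automatically
llama-server -m <first-Q4_K_M-shard>.gguf -c 8192
```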

Tags: chat, reasoning, quality

Strengths

  • Hybrid thinking/non-thinking mode
  • Only 37B active parameters despite 671B total
  • Top-tier quality
  • Among the strongest open-weight models released to date

Limitations

  • Requires 512 GB+ RAM for full model
  • Server-grade hardware only
  • Complex setup

Best use cases

  • Maximum quality outputs
  • Research
  • Enterprise deployment
  • Frontier AI tasks

Benchmarks

Speed: 1/10

Quality: 10/10

Coding: 10/10

Reasoning: 10/10

Technical details

Developer: DeepSeek AI

License: MIT

Context window: 131,072 tokens

Architecture: Mixture of Experts (MoE) — 671B total, ~37B active

Released: 2025-08
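The "Mixture of Experts" row above is the key to the 671B-total / ~37B-active split: a router scores all experts but runs only the top-k per token. A minimal sketch of top-k gating (toy code, not DeepSeek's actual routing implementation):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_scores, k=2):
    """Toy top-k MoE layer: only k experts run per token, so most
    parameters stay idle -- the source of the 37B-active figure."""
    topk = sorted(range(len(gate_scores)),
                  key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in topk])
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

# Eight toy "experts", each just scaling its input
experts = [lambda x, s=s: s * x for s in range(1, 9)]
print(moe_forward(1.0, experts, [0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 5.0]))
```

With k=2 of 8 experts selected, only a quarter of the toy model's parameters touch each input; DeepSeek's ratio (37B of 671B, about 5.5%) is sparser still, which is what keeps per-token compute manageable despite the huge total size.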