Qwen 3 Next (80B/3B MoE)

Alibaba's next-generation sparse MoE, built on a hybrid attention design that interleaves Gated DeltaNet layers with standard gated attention. Only 3B of its 80B parameters are active per token, so it decodes at roughly the speed of a small dense model while delivering quality competitive with much larger dense ones. 256K native context (extensible toward 1M). Ships in separate Instruct and Thinking variants. Apache 2.0.
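
To make the active-parameter claim concrete: in an MoE layer, a router picks a few experts per token and only those run, so per-token compute scales with active rather than total parameters. A minimal, illustrative top-k router in Python (NumPy only; the sizes below are toy values, not Qwen3-Next's real dimensions):

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Illustrative top-k mixture-of-experts routing.

    x       : (d,) token activation
    gate_w  : (d, n_experts) router weights
    experts : list of (w_in, w_out) weight pairs, one per expert
    k       : number of experts activated per token

    Only k of the n_experts expert MLPs execute, so per-token compute
    scales with *active* parameters, not total parameters.
    """
    logits = x @ gate_w
    top = np.argsort(logits)[-k:]        # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()             # softmax over the selected experts only
    out = np.zeros_like(x)
    for w, idx in zip(weights, top):
        w_in, w_out = experts[idx]
        out += w * (np.maximum(x @ w_in, 0.0) @ w_out)  # tiny ReLU MLP expert
    return out

# Toy sizes: 8 experts, 2 active -- per-token compute is ~2/8 of a dense
# layer holding the same total parameter count.
rng = np.random.default_rng(0)
d, n_experts = 64, 8
gate_w = rng.normal(size=(d, n_experts))
experts = [(rng.normal(size=(d, 4 * d)), rng.normal(size=(4 * d, d)))
           for _ in range(n_experts)]
y = moe_layer(rng.normal(size=d), gate_w, experts, k=2)
print(y.shape)  # (64,)
```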

Parameters: 80B (3B active)
Minimum RAM: 64 GB
Model size: 48 GB
Quantization: Q4_K_M
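
The listed 48 GB model size is consistent with simple bits-per-weight arithmetic. A quick sanity check, assuming Q4_K_M averages about 4.8 bits per weight (the true average varies slightly with the tensor mix):

```python
# Back-of-the-envelope GGUF size estimate for an 80B-parameter model.
# Q4_K_M mixes quant types; ~4.8 bits/weight is a common ballpark, not exact.
total_params = 80e9
bits_per_weight = 4.8

size_gb = total_params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.0f} GB")  # ~48 GB, matching the listed model size
```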

Can Qwen 3 Next (80B/3B MoE) run locally?

Qwen 3 Next (80B/3B MoE) is best suited to high-end workstations. LocalClaw recommends Q4_K_M as the default quantization, which brings the weights to roughly 48 GB on disk and calls for at least 64 GB of system RAM.

Search term for LM Studio or compatible runtimes: qwen3-next-80b-a3b
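
Once loaded, LM Studio serves an OpenAI-compatible API (default http://localhost:1234/v1), so any OpenAI client can talk to it. A minimal sketch, assuming the default port and that the model registers under the identifier above (the exact name can vary by quant and build):

```python
from openai import OpenAI

# LM Studio's local server speaks the OpenAI API; the key is ignored.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

resp = client.chat.completions.create(
    model="qwen3-next-80b-a3b",  # adjust to the identifier LM Studio shows
    messages=[{"role": "user", "content": "Summarize MoE routing in one sentence."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```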

Hugging Face repository: Qwen/Qwen3-Next-80B-A3B-Instruct
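
To pull the weights straight from Hugging Face, huggingface_hub's snapshot_download is enough. Note this is the full-precision Instruct repo (about 160 GB at bf16, i.e. 80B × 2 bytes); the 48 GB figure above refers to a Q4_K_M GGUF, which is published in separate community repos. The target directory below is just a placeholder:

```python
from huggingface_hub import snapshot_download

# Downloads the full-precision Instruct weights (~160 GB at bf16);
# quantized GGUF builds live in separate community repos.
path = snapshot_download(
    repo_id="Qwen/Qwen3-Next-80B-A3B-Instruct",
    local_dir="./qwen3-next-80b-a3b-instruct",  # placeholder target directory
)
print(path)
```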

Strengths

  • Extreme sparsity: only 3B of 80B parameters are active per token, so throughput is high for its quality tier.
  • 256K tokens of native context, extensible toward 1M.
  • Hybrid Gated DeltaNet / gated attention layers keep very long contexts affordable.
  • Permissive Apache 2.0 license.

Limitations

  • Performance depends heavily on quantization, RAM bandwidth, and runtime support for the hybrid attention stack (see the decode-speed sketch below).
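
Decoding is typically memory-bandwidth bound: each generated token must stream roughly the active weights from RAM once, which puts a hard ceiling on tokens per second. A back-of-the-envelope estimate, with illustrative bandwidth figures (real throughput lands below this due to KV cache traffic, routing overhead, and shared layers):

```python
# Rough upper bound on decode speed for a memory-bandwidth-bound MoE model:
# each token must stream (approximately) the *active* weights from RAM once.
active_params = 3e9          # active parameters per token
bytes_per_weight = 4.8 / 8   # ~Q4_K_M average

bytes_per_token = active_params * bytes_per_weight  # ~1.8 GB

for name, bw_gbps in [("dual-channel DDR5", 90), ("Apple M-series unified", 400)]:
    # illustrative bandwidth figures, not measurements
    print(f"{name}: <= {bw_gbps * 1e9 / bytes_per_token:.0f} tok/s (theoretical)")
```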

Best use cases

  • chat
  • code
  • reasoning
  • power
  • quality

Benchmarks

Speed: 8/10

Quality: 9/10

Coding: 9/10

Reasoning: 9/10

Technical details

Developer: Alibaba (Qwen team)

License: Apache 2.0

Context window: 256K tokens native (extensible to 1M)

Architecture: Sparse MoE with hybrid Gated DeltaNet + gated attention layers

Released: 2025-09