
Qwen 3 MoE (235B/22B active)

A Mixture of Experts behemoth: only 22B parameters are active per token, so it runs fast despite its massive total size. Top-tier quality.

Parameters: 235B (22B active)
Minimum RAM: 96 GB
Model size: 80 GB
Quantization: Q4_K_M
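
Whether those figures fit a given machine is easy to check before downloading anything. The snippet below is a minimal pre-flight sketch, assuming the psutil package is installed; the 96 GB and 80 GB thresholds are simply the figures listed above, not measured values.

```python
# Pre-flight check against the specs listed above.
# Assumes `pip install psutil`; thresholds mirror LocalClaw's
# listed figures (96 GB minimum RAM, 80 GB on disk at Q4_K_M).
import psutil

MIN_RAM_GB = 96     # LocalClaw's listed minimum for Q4_K_M
MODEL_SIZE_GB = 80  # listed size of the Q4_K_M files

total_gb = psutil.virtual_memory().total / 1024**3
if total_gb < MIN_RAM_GB:
    print(f"Only {total_gb:.0f} GB RAM detected; "
          f"{MIN_RAM_GB} GB+ is recommended for this model.")
else:
    headroom = total_gb - MODEL_SIZE_GB
    print(f"{total_gb:.0f} GB RAM detected; "
          f"~{headroom:.0f} GB left for context and the OS after weights.")
```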

Can Qwen 3 MoE (235B/22B active) run locally?

Qwen 3 MoE (235B/22B active) is best suited for large-memory workstations. LocalClaw recommends Q4_K_M as the default quantization, with at least 96 GB RAM.

Search term for LM Studio or compatible runtimes: qwen3-235b-a22b

Hugging Face repository: lmstudio-community/Qwen3-235B-A22B-GGUF
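
For a scripted alternative to LM Studio, the repository above can be pulled and loaded with the llama-cpp-python bindings. This is a minimal sketch, not LocalClaw's recommended path: the shard-name pattern is an assumption about how the Q4_K_M files are named, so verify it against the repository's actual file listing before starting an ~80 GB download.

```python
# Sketch: download the Q4_K_M shards, then load them with llama-cpp-python.
# Assumes `pip install llama-cpp-python huggingface_hub`. The filename
# pattern below is an assumption; check the repo's file listing first.
from pathlib import Path
from huggingface_hub import snapshot_download
from llama_cpp import Llama

local_dir = snapshot_download(
    repo_id="lmstudio-community/Qwen3-235B-A22B-GGUF",
    allow_patterns=["*Q4_K_M*"],  # fetch only the Q4_K_M files (~80 GB)
)

# llama.cpp can load a split GGUF from its first shard and
# picks up the remaining shards from the same directory.
first_shard = sorted(Path(local_dir).rglob("*Q4_K_M*.gguf"))[0]
llm = Llama(model_path=str(first_shard), n_ctx=8192)  # far below the 131,072 max

out = llm.create_chat_completion(
    messages=[{"role": "user",
               "content": "Summarize Mixture of Experts in two sentences."}]
)
print(out["choices"][0]["message"]["content"])
```

LM Studio users can skip the script entirely and simply search for qwen3-235b-a22b as noted above.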

Tags: chat, code, reasoning, quality

Strengths

  • Only 22B active despite 235B total
  • Fast for its power level
  • Apache 2.0
  • Top-tier quality

Limitations

  • Requires 96 GB+ RAM
  • Complex MoE deployment
  • Very large files

Best use cases

  • Maximum-quality AI
  • Enterprise deployment
  • Research
  • Complex reasoning

Benchmarks

Speed: 3/10

Quality: 10/10

Coding: 10/10

Reasoning: 10/10

Technical details

Developer: Alibaba Cloud (Qwen Team)

License: Apache 2.0

Context window: 131,072 tokens

Architecture: Mixture of Experts, 235B total parameters with 22B active per token (see the routing sketch below)

Released: 2025-04
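
To make the "22B active per token" figure concrete, here is a toy top-k MoE router in NumPy. It is an illustrative sketch only, not Qwen's actual routing code, and the expert count and top_k below are small made-up values rather than Qwen 3's real configuration.

```python
# Toy illustration of MoE top-k routing: every expert must sit in
# memory, but each token is multiplied through only top_k of them.
# d_model / n_experts / top_k are illustrative, not Qwen 3's config.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 16, 2

# One tiny feed-forward "expert" per slot (all held in memory).
experts = [rng.standard_normal((d_model, d_model)) * 0.02
           for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a single token vector through its top-k experts."""
    logits = x @ router_w              # one router score per expert
    top = np.argsort(logits)[-top_k:]  # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the chosen experts
    # Only top_k of n_experts matrices are multiplied per token:
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
y = moe_layer(token)
print(f"used {top_k}/{n_experts} experts; output shape {y.shape}")
```

The point the toy makes: all expert weights must be resident (which is why the RAM floor tracks the 235B total), while per-token compute touches only the routed slice (which is why speed tracks the 22B active parameters).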