Local LLM model page

QwQ (32B)

Early Qwen reasoning model, superseded by GLM-4 32B and Qwen3 32B for most tasks. Still decent for pure math.

Parameters: 32B
Minimum RAM: 24 GB
Model size: 19 GB
Quantization: Q4_K_M
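
The 19 GB model size follows roughly from the parameter count times the average bits per weight of the quantization. A minimal sketch of that arithmetic, assuming Q4_K_M averages about 4.8 bits per weight (the exact figure varies by layer mix):

```python
def gguf_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough GGUF file size: parameters x average bits per weight."""
    size_bytes = params_billion * 1e9 * bits_per_weight / 8
    return size_bytes / 1e9  # decimal GB, as model pages usually report

# Assumed average bits per weight: Q4_K_M ~4.8, Q8_0 ~8.5
print(f"QwQ 32B at Q4_K_M: ~{gguf_size_gb(32, 4.8):.0f} GB")  # ~19 GB
print(f"QwQ 32B at Q8_0:   ~{gguf_size_gb(32, 8.5):.0f} GB")  # ~34 GB
```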

Can QwQ (32B) run locally?

QwQ (32B) is best suited to power-user machines with 32 GB of RAM or more. LocalClaw recommends Q4_K_M as the default quantization, which needs at least 24 GB of RAM.

Search term for LM Studio or compatible runtimes: qwq-32b-preview

Hugging Face repository: lmstudio-community/QwQ-32B-Preview-GGUF
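
Once the GGUF is downloaded and loaded, LM Studio (and most compatible runtimes) exposes an OpenAI-compatible local server, so the model can be queried from a script. A minimal sketch, assuming LM Studio's default port 1234 and that the loaded model's identifier matches the search term above:

```python
from openai import OpenAI

# LM Studio's local server speaks the OpenAI API; the API key is ignored.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="qwq-32b-preview",  # assumption: identifier as shown in LM Studio
    messages=[
        {"role": "user", "content": "How many positive integers n satisfy n^2 < 200?"},
    ],
    temperature=0.7,
    max_tokens=4096,  # leave headroom: the model thinks out loud before answering
)

print(response.choices[0].message.content)
```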

Strengths

  • Reasoning competitive with o1-preview on math benchmarks
  • Shows chain-of-thought process
  • Apache 2.0
  • Strong math/logic

Limitations

  • Verbose outputs (thinking tokens; see the sketch after this list)
  • Slower due to reasoning overhead
  • Needs 24 GB+ RAM
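
Some QwQ chat templates and GGUF builds wrap the reasoning trace in <think> ... </think> tags ahead of the final answer. When that convention applies, the verbose trace can be split off programmatically; a minimal sketch, assuming that tag format (untagged output passes through unchanged):

```python
import re

THINK_BLOCK = re.compile(r"<think>(.*?)</think>\s*", re.DOTALL)

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block is present."""
    match = THINK_BLOCK.search(text)
    reasoning = match.group(1).strip() if match else ""
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer

raw = "<think>14^2 = 196 < 200 and 15^2 = 225 > 200, so n runs from 1 to 14.</think>There are 14 such integers."
reasoning, answer = split_reasoning(raw)
print(answer)  # -> There are 14 such integers.
```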

Best use cases

  • Complex math problems
  • Logical reasoning
  • Scientific analysis
  • Strategic planning

Benchmarks

Speed: 4/10

Quality: 7/10

Coding: 6/10

Reasoning: 8/10

Technical details

Developer: Alibaba Cloud (Qwen Team)

License: Apache 2.0

Context window: 32,768 tokens

Architecture: Transformer (Qwen2.5-32B base), reasoning-focused fine-tune

Released: 2024-11
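
The RAM minimum has to cover the weights plus the KV cache, which grows linearly with context. A rough sketch of that second term, assuming QwQ inherits Qwen2.5-32B's attention layout (64 layers, 8 KV heads via GQA, head dimension 128; these figures are assumptions taken from the base model, so check the GGUF metadata) and an unquantized fp16 cache:

```python
def kv_cache_gib(tokens: int, layers: int = 64, kv_heads: int = 8,
                 head_dim: int = 128, bytes_per_value: int = 2) -> float:
    """KV cache size in GiB: 2 (K and V) x layers x kv_heads x head_dim x bytes, per token."""
    per_token_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value
    return tokens * per_token_bytes / 2**30

print(f"{kv_cache_gib(4096):.1f} GiB at 4K context")    # 1.0 GiB
print(f"{kv_cache_gib(32768):.1f} GiB at 32K context")  # 8.0 GiB
```

Under these assumptions, a full 32K context adds roughly 8 GiB on top of the ~19 GB of weights, which is why 32 GB is the comfortable target; shorter contexts or a quantized KV cache keep usage closer to the 24 GB floor.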