Local LLM model page

OLMo 2 (32B)

Allen AI's fully open 32B model: weights, training data, and training code are all public. Strong general-purpose performance at 32B, released under Apache 2.0.

Parameters
32B
Minimum RAM
24 GB
Model size
19 GB
Quantization
Q4_K_M
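The spec-table numbers above can be sanity-checked with simple arithmetic: Q4_K_M averages roughly 4.8 bits per weight (an approximation; the exact rate varies by tensor mix), so 32B parameters come out near the listed 19 GB. A minimal sketch:

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate on-disk size of a quantized model in decimal GB."""
    # bytes = weight_count * bits / 8; with params in billions this is GB directly
    return params_billions * bits_per_weight / 8

# ~4.8 bits/weight is an assumed average for Q4_K_M, not an exact figure
print(round(quantized_size_gb(32, 4.8), 1))  # ~19.2 GB, close to the listed 19 GB
```

Actual RAM use is higher than the file size, since the runtime also allocates context buffers and scratch space, which is why the minimum RAM figure (24 GB) exceeds the model size.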

Can OLMo 2 (32B) run locally?

OLMo 2 (32B) is best suited to power-user machines with 32 GB of RAM or more. LocalClaw recommends Q4_K_M as the default quantization, with at least 24 GB of RAM.

Search term for LM Studio or compatible runtimes: olmo2-32b-instruct

Hugging Face repository: allenai/OLMo-2-0325-32B-Instruct-GGUF

Tags: chat, power, general

Strengths

  • 100% open — weights, data, training code, and training logs all public
  • Strong general performance at 32B
  • Apache 2.0
  • Fully reproducible research

Limitations

  • Not as strong as Qwen 3 32B on benchmarks
  • Limited non-English support
  • Needs 24 GB+ RAM

Best use cases

  • Research and reproducibility
  • General purpose assistant
  • Fine-tuning base
  • Academic benchmarking
  • Transparency-focused deployments

Benchmarks

Speed: 4/10

Quality: 8/10

Coding: 7/10

Reasoning: 8/10

Technical details

Developer: Allen AI (AI2)

License: Apache 2.0

Context window: 32,768 tokens

Architecture: Transformer (decoder-only) — fully open training pipeline

Released: 2025-03
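Beyond the weights themselves, a long context window costs additional RAM for the KV cache. The formula below is the standard one for decoder-only transformers; the hyperparameters used in the example (64 layers, 8 KV heads, head dim 128) are illustrative assumptions, not confirmed OLMo 2 values:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_len: int, bytes_per_elem: int = 2) -> int:
    """KV-cache size for a decoder-only transformer.

    Two tensors (K and V) per layer, each holding n_kv_heads * head_dim
    values per token, stored for every token in the context window.
    fp16 cache entries take 2 bytes each.
    """
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Hypothetical hyperparameters -- NOT confirmed OLMo 2 (32B) values.
gb = kv_cache_bytes(64, 8, 128, 32768) / 1e9
print(f"{gb:.1f} GB")  # roughly 8.6 GB at fp16 for a full 32,768-token window
```

In practice, runtimes let you trade this down by running a shorter context or a quantized KV cache, which is worth knowing when fitting the model into the 24 GB minimum.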