Local LLM model page
OLMo 2 (32B)
Allen AI's fully open 32B model: weights, training data, and training code are all public. Strong general-purpose performance at the 32B scale. Apache 2.0 licensed.
Parameters
32B
Minimum RAM
24 GB
Model size
19 GB
Quantization
Q4_K_M
Can OLMo 2 (32B) run locally?
OLMo 2 (32B) is best suited to power-user machines. LocalClaw recommends Q4_K_M as the default quantization, which needs at least 24 GB of RAM; 32 GB gives comfortable headroom for longer contexts.
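The 19 GB model size follows directly from the parameter count and the quantization level. A minimal sketch of that arithmetic, assuming roughly 4.85 bits per weight for Q4_K_M (a common community estimate, not an official figure):

```python
# Rough on-disk size math for a quantized 32B model.
# BPW_Q4_K_M is an approximate community estimate, not an official number.
PARAMS = 32e9          # OLMo 2 32B parameter count
BPW_Q4_K_M = 4.85      # assumed average bits per weight for Q4_K_M

def gguf_size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of a quantized GGUF file in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9

print(f"~{gguf_size_gb(PARAMS, BPW_Q4_K_M):.0f} GB")
```

Actual RAM use at runtime is higher than the file size because the KV cache and runtime buffers come on top, which is why the minimum is 24 GB rather than 19 GB.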
Search term for LM Studio or compatible runtimes: olmo2-32b-instruct
Hugging Face repository: allenai/OLMo-2-0325-32B-Instruct-GGUF
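For runtimes without a built-in model browser, the GGUF can be fetched and run manually. A minimal sketch using `huggingface-cli` and llama.cpp's `llama-cli`; the exact GGUF filename inside the repository is an assumption and should be checked on the repo page first:

```shell
# Download one quantized file from the official GGUF repo
# (filename below is illustrative -- verify it on Hugging Face first)
huggingface-cli download allenai/OLMo-2-0325-32B-Instruct-GGUF \
  OLMo-2-0325-32B-Instruct-Q4_K_M.gguf --local-dir .

# Run an interactive chat with llama.cpp
llama-cli -m OLMo-2-0325-32B-Instruct-Q4_K_M.gguf -c 4096 -cnv
```

The `-c 4096` flag keeps the context window small to reduce memory pressure; it can be raised toward the model's 32,768-token limit on machines with spare RAM.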
Tags: chat, power, general
Strengths
- 100% open — weights, data, training code, and training logs all public
- Strong general performance at 32B
- Apache 2.0
- Fully reproducible research
Limitations
- Not as strong as Qwen 3 32B on benchmarks
- Limited non-English support
- Needs 24 GB+ RAM
Best use cases
- Research and reproducibility
- General purpose assistant
- Fine-tuning base
- Academic benchmarking
- Transparency-focused deployments
Benchmarks
Speed: 4/10
Quality: 8/10
Coding: 7/10
Reasoning: 8/10
Technical details
Developer: Allen AI (AI2)
License: Apache 2.0
Context window: 32,768 tokens
Architecture: Transformer (decoder-only) — fully open training pipeline
Released: 2025-03