Local LLM model page

OLMo 2 (32B)

Allen AI's fully open 32B model: weights, training data, and training code are all public. Strong general-purpose performance at 32B, released under Apache 2.0.

Parameters
32B
Minimum RAM
24 GB
Model size
19 GB
Quantization
Q4_K_M
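The spec-table numbers above can be sanity-checked with simple arithmetic: Q4_K_M averages roughly 4.8 bits per weight (an approximation; the exact rate varies by tensor mix), so 32B parameters come out near the listed 19 GB. A minimal sketch:

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate on-disk size of a quantized model in decimal GB."""
    # bytes = weight_count * bits / 8; with params in billions this is GB directly
    return params_billions * bits_per_weight / 8

# ~4.8 bits/weight is an assumed average for Q4_K_M, not an exact figure
print(round(quantized_size_gb(32, 4.8), 1))  # ~19.2 GB, close to the listed 19 GB
```

Actual RAM use is higher than the file size, since the runtime also allocates context buffers and scratch space, which is why the minimum RAM figure (24 GB) exceeds the model size.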

Can OLMo 2 (32B) run locally?

OLMo 2 (32B) is best suited to power-user machines with 32 GB of RAM or more. LocalClaw recommends Q4_K_M as the default quantization, with at least 24 GB of RAM.

Search term for LM Studio or compatible runtimes: olmo2-32b-instruct

Hugging Face repository: allenai/OLMo-2-0325-32B-Instruct-GGUF

Tags: chat, power, general

Strengths

  • 100% open — weights, data, training code, and training logs all public
  • Strong general performance at 32B
  • Apache 2.0
  • Fully reproducible research

Limitations

  • Not as strong as Qwen 3 32B on benchmarks
  • Limited non-English support
  • Needs 24 GB+ RAM

Best use cases

  • Research and reproducibility
  • General purpose assistant
  • Fine-tuning base
  • Academic benchmarking
  • Transparency-focused deployments

Benchmarks

Speed: 4/10

Quality: 8/10

Coding: 7/10

Reasoning: 8/10

Technical details

Developer: Allen AI (AI2)

License: Apache 2.0

Context window: 32,768 tokens

Architecture: Transformer (decoder-only) — fully open training pipeline

Released: 2025-03
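Beyond the weights themselves, a long context window costs additional RAM for the KV cache. The formula below is the standard one for decoder-only transformers; the hyperparameters used in the example (64 layers, 8 KV heads, head dim 128) are illustrative assumptions, not confirmed OLMo 2 values:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_len: int, bytes_per_elem: int = 2) -> int:
    """KV-cache size for a decoder-only transformer.

    Two tensors (K and V) per layer, each holding n_kv_heads * head_dim
    values per token, stored for every token in the context window.
    fp16 cache entries take 2 bytes each.
    """
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Hypothetical hyperparameters -- NOT confirmed OLMo 2 (32B) values.
gb = kv_cache_bytes(64, 8, 128, 32768) / 1e9
print(f"{gb:.1f} GB")  # roughly 8.6 GB at fp16 for a full 32,768-token window
```

In practice, runtimes let you trade this down by running a shorter context or a quantized KV cache, which is worth knowing when fitting the model into the 24 GB minimum.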