
Gemma 3 (12B)

Google's 12B-parameter multimodal model. It understands images natively and delivers excellent quality on 16 GB machines.

Parameters: 12B
Minimum RAM: 16 GB
Model size: 8 GB
Quantization: Q4_K_M
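
The listed file size lines up with the parameter count and quantization level. As a rough check, assuming Q4_K_M averages about 4.8 bits per weight (an approximation; real GGUF files also carry metadata and a few higher-precision tensors):

```python
# Rough size estimate for a 12B model at Q4_K_M.
# 4.8 bits/weight is an assumed average for Q4_K_M, not an exact figure.
params = 12e9
bits_per_weight = 4.8
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"~{weights_gb:.1f} GB of quantized weights")  # ~7.2 GB, plus overhead -> ~8 GB on disk
```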

Can Gemma 3 (12B) run locally?

Gemma 3 (12B) is best suited to mainstream Macs and PCs with at least 16 GB of RAM. LocalClaw recommends Q4_K_M as the default quantization at this memory tier.

Search term for LM Studio or compatible runtimes: gemma-3-12b-it

Hugging Face repository: lmstudio-community/gemma-3-12B-it-GGUF
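
Outside LM Studio, one way to try the model is to pull the GGUF from that repository and load it with llama-cpp-python. A minimal sketch follows; the exact filename is an assumption, so check the repository's file list first.

```python
# Minimal sketch: download the Q4_K_M GGUF and run a single chat turn locally.
# Requires: pip install huggingface_hub llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="lmstudio-community/gemma-3-12B-it-GGUF",
    filename="gemma-3-12B-it-Q4_K_M.gguf",  # assumed filename; verify on the repo page
)

llm = Llama(model_path=model_path, n_ctx=8192)  # modest context to stay within 16 GB RAM
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the Gemma 3 release in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```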

Tags: chat, vision, power, general

Strengths

  • 128K context at 12B size
  • Vision support
  • Strong multilingual support
  • Great price/performance

Limitations

  • Needs 16 GB RAM
  • Not best-in-class for coding

Best use cases

  • Long document analysis
  • Multilingual assistant
  • Image + text tasks
  • Research

Benchmarks

Speed: 6/10

Quality: 8/10

Coding: 7/10

Reasoning: 8/10

Technical details

Developer: Google DeepMind

License: Gemma License

Context window: 131,072 tokens

Architecture: Transformer with 128K context, vision support

Released: 2025-03
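
A 131,072-token window also costs memory beyond the weights: the KV cache grows linearly with context length. The sketch below uses the standard full-attention estimate with illustrative placeholder layer and head counts (not Gemma 3's published architecture); Gemma 3's interleaved sliding-window attention reduces the real figure substantially, but the trend shows why very long contexts are tight on a 16 GB machine.

```python
# KV-cache size ~= 2 (K and V) * layers * kv_heads * head_dim * tokens * bytes_per_value.
# The layer/head/dim values below are illustrative placeholders, not Gemma 3 specifics,
# and this ignores Gemma 3's sliding-window layers, which shrink the cache considerably.
layers, kv_heads, head_dim = 48, 8, 256
bytes_per_value = 2  # fp16 cache
for tokens in (8_192, 32_768, 131_072):
    cache_gb = 2 * layers * kv_heads * head_dim * tokens * bytes_per_value / 1e9
    print(f"{tokens:>7} tokens -> ~{cache_gb:.1f} GB KV cache (naive full-attention estimate)")
```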