Local LLM model page
Gemma 2 (9B)
Google Gemma 2nd gen. Excellent quality-to-size ratio. 8.1M downloads. Great all-around model.
Parameters
9B
Minimum RAM
8 GB
Model size
5.5 GB
Quantization
Q5_K_M
Can Gemma 2 (9B) run locally?
Gemma 2 (9B) is best suited for entry-level laptops and desktops. LocalClaw recommends Q5_K_M as the default quantization, with at least 8 GB RAM.
Search term for LM Studio or compatible runtimes: gemma-2-9b-it
Hugging Face repository: lmstudio-community/gemma-2-9B-it-GGUF
chatcodestandardgeneral
Strengths
- Excellent quality-to-size ratio
- 8.1M+ downloads — battle-tested
- Strong reasoning for 9B
- Good coding abilities
Limitations
- Only 8K context window
- Gemma license more restrictive than Apache/MIT
- Not the best for non-English tasks
Best use cases
- General chat assistant
- Code completion
- Content writing
- Summarization
- Light reasoning tasks
Benchmarks
Speed: 8/10
Quality: 7/10
Coding: 7/10
Reasoning: 7/10
Technical details
Developer: Google DeepMind
License: Gemma License
Context window: 8,192 tokens
Architecture: Transformer (decoder-only) with sliding window attention
Released: 2024-06