Local LLM model page

Gemma 2 (9B)

Google's second-generation Gemma model. Excellent quality-to-size ratio, with 8.1M downloads; a great all-around model.

Parameters
9B
Minimum RAM
8 GB
Model size
5.5 GB
Quantization
Q5_K_M

Can Gemma 2 (9B) run locally?

Yes. Gemma 2 (9B) is well suited to entry-level laptops and desktops. LocalClaw recommends Q5_K_M as the default quantization, with at least 8 GB of RAM.
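The relationship between parameter count, quantization, and file size is simple arithmetic. The sketch below estimates a quantized model's on-disk size; the parameter count (~9.24B for Gemma 2 9B) and the effective bits per weight for Q5_K_M (~5.7) are ballpark assumptions, not figures from this page, and real GGUF files vary because some tensors use different quant types.

```python
# Rough size estimate for a quantized model file.
# Assumptions (not from this page): ~9.24e9 weights for Gemma 2 9B,
# ~5.7 effective bits per weight for Q5_K_M.

def estimate_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GB: params * bits / 8 bits per byte."""
    return n_params * bits_per_weight / 8 / 1e9

size = estimate_size_gb(9.24e9, 5.7)  # ~6.6 GB by this formula
```

Runtime RAM use is higher than the file size alone: the runtime also allocates a KV cache (which grows with context length) plus some overhead, which is why an 8 GB minimum is quoted for a ~5.5 GB file.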

Search term for LM Studio or compatible runtimes: gemma-2-9b-it

Hugging Face repository: lmstudio-community/gemma-2-9B-it-GGUF
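If you want to fetch the file outside LM Studio, Hugging Face serves raw files at a predictable `resolve` URL. A minimal sketch, assuming a filename that follows the repo's usual naming convention (verify against the repo's file list before relying on it):

```python
# Sketch: build a direct download URL for a GGUF file on Hugging Face.
REPO = "lmstudio-community/gemma-2-9B-it-GGUF"
FILENAME = "gemma-2-9b-it-Q5_K_M.gguf"  # assumed filename; check the repo

def hf_resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Hugging Face serves raw files at /{repo}/resolve/{revision}/{file}."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

url = hf_resolve_url(REPO, FILENAME)
```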

Tags: chat, code, standard, general

Strengths

  • Excellent quality-to-size ratio
  • 8.1M+ downloads; battle-tested
  • Strong reasoning for 9B
  • Good coding abilities

Limitations

  • Only 8K context window
  • Gemma license more restrictive than Apache/MIT
  • Not the best for non-English tasks

Best use cases

  • General chat assistant
  • Code completion
  • Content writing
  • Summarization
  • Light reasoning tasks

Benchmarks

Speed: 8/10

Quality: 7/10

Coding: 7/10

Reasoning: 7/10

Technical details

Developer: Google DeepMind

License: Gemma License

Context window: 8,192 tokens

Architecture: Transformer (decoder-only) with sliding window attention

Released: 2024-06
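The sliding window attention mentioned in the architecture line restricts each token to attending only to a fixed number of recent positions. A minimal sketch of such a mask, using a window of 4 for illustration (Gemma 2's actual local window is much larger, and the model interleaves local and global attention layers):

```python
# Sliding-window causal attention mask (illustrative toy window of 4).
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query i may attend to key j:
    causal (j <= i) and within the last `window` positions."""
    return [[(i - window < j <= i) for j in range(seq_len)]
            for i in range(seq_len)]

mask = sliding_window_mask(6, 4)
# Row 5 attends only to positions 2..5, not the whole prefix,
# which keeps attention cost linear in window size rather than sequence length.
```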