Gemma 3 (4B)
Google's multimodal gem. Understands text AND images natively. Great quality-to-size ratio.
Parameters: 4B
Minimum RAM: 8 GB
Model size: 3 GB
Quantization: Q5_K_M
Can Gemma 3 (4B) run locally?
Yes. Gemma 3 (4B) is well suited to entry-level laptops and desktops. LocalClaw recommends Q5_K_M as the default quantization, with at least 8 GB of RAM.
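A quick way to sanity-check the 8 GB figure is to add the weight file to an estimate of the key/value cache plus some runtime overhead. The sketch below uses placeholder architecture numbers (layer count, KV heads, head dimension) chosen for illustration rather than taken from the Gemma 3 spec, and it ignores Gemma 3's sliding-window attention, so treat it as a rough upper bound.

```python
# Rough local-memory estimate for a quantized model: weights + KV cache + overhead.
# The architecture numbers below are illustrative assumptions, not published Gemma 3 specs.

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                n_ctx: int, bytes_per_elem: int = 2) -> float:
    """Key/value cache size: 2 tensors (K and V) per layer, one slot per context token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem / 1024**3

weights_gb = 3.0                 # Q5_K_M file size quoted on this page
ctx = 8_192                      # a modest context, far below the 128K maximum
kv = kv_cache_gb(n_layers=34, n_kv_heads=4, head_dim=256, n_ctx=ctx)
overhead_gb = 1.0                # runtime buffers and OS headroom (assumption)

print(f"~{weights_gb + kv + overhead_gb:.1f} GB needed for a {ctx}-token context")
```

With these placeholder values the total lands around 5 GB, which is why 8 GB of system RAM is a comfortable floor for short-to-medium contexts.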
Search term for LM Studio or compatible runtimes: gemma-3-4b-it
Hugging Face repository: lmstudio-community/gemma-3-4B-it-GGUF
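Outside LM Studio, one way to fetch and run the same GGUF is via the llama-cpp-python bindings. This is a minimal sketch, not official instructions: the exact .gguf filename inside the repository is an assumption (check the repo's file listing), and n_ctx is kept far below the 128K maximum to stay within 8 GB.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama  # pip install llama-cpp-python huggingface_hub

# Download one quantized file from the repository named on this page.
# The filename is an assumption; list the repo on Hugging Face to confirm it.
model_path = hf_hub_download(
    repo_id="lmstudio-community/gemma-3-4B-it-GGUF",
    filename="gemma-3-4B-it-Q5_K_M.gguf",
)

# Small context window to keep memory modest; the model supports up to 131,072 tokens.
llm = Llama(model_path=model_path, n_ctx=8192)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain what a context window is in one sentence."}],
    max_tokens=128,
)
print(reply["choices"][0]["message"]["content"])
```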
Tags: chat, vision, standard, general
Strengths
- 128K context window at only 4B
- Multimodal (image understanding)
- Excellent for its size
- 140+ languages
Limitations
- Not as strong as 8B+ models on hard tasks
- Vision capabilities are basic compared to specialized vision models
Best use cases
- Long document processing
- Multilingual chat
- Basic image analysis (see the sketch after this list)
- Mobile/edge deployment
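To illustrate the basic-image-analysis use case, here is a minimal sketch assuming the Ollama Python client and a locally pulled gemma3:4b model (a separate packaging of the same weights, not the GGUF repository above); the image path is a placeholder.

```python
import ollama  # pip install ollama; requires a running Ollama server and `ollama pull gemma3:4b`

# Ask the model to describe a local image; "photo.jpg" is a placeholder path.
response = ollama.chat(
    model="gemma3:4b",
    messages=[{
        "role": "user",
        "content": "Describe this image in two sentences.",
        "images": ["photo.jpg"],
    }],
)
print(response["message"]["content"])
```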
Benchmarks
Speed: 9/10
Quality: 6/10
Coding: 5/10
Reasoning: 6/10
Technical details
Developer: Google DeepMind
License: Gemma License
Context window: 131,072 tokens
Architecture: Transformer with 128K context, vision support
Released: 2025-03