Local LLM model page
Llama 4 Scout (17B/16E MoE)
Meta's multimodal MoE model. 17B active params across 16 experts (~109B total). Built-in image understanding. Trained for a 10M-token context window, though local runtimes typically expose 131,072 tokens. Llama 4 Community License. 728K downloads.
Parameters
17B active (109B total, 16 experts)
Minimum RAM
16 GB
Model size
10 GB
Quantization
Q4_K_M
Can Llama 4 Scout (17B/16E MoE) run locally?
Llama 4 Scout (17B/16E MoE) is best suited to mainstream Macs and PCs with at least 16 GB of RAM. LocalClaw recommends Q4_K_M as the default quantization at that memory budget.
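As a rough sanity check on quantization choices, on-disk model size can be estimated as parameter count times bits per weight. A minimal sketch, assuming Q4_K_M averages roughly 4.5–5 bits per weight (the exact figure varies with the tensor mix):

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Estimate on-disk model size in GB: params * bits, divided by 8 bits per byte."""
    return n_params * bits_per_weight / 8 / 1e9

# 17B active parameters at ~4.5 bits/weight (assumed Q4_K_M average)
print(f"{quantized_size_gb(17e9, 4.5):.1f} GB")  # ≈ 9.6 GB
```

One caveat: GGUF files for MoE models generally ship every expert, so total parameters (not just active ones) drive the actual file size.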
Search term for LM Studio or compatible runtimes: llama-4-scout
Hugging Face repository: meta-llama/Llama-4-Scout-17B-16E-Instruct-GGUF
Tags: chat, vision, power, general
Strengths
- Latest Meta multimodal model
- Built-in vision
- MoE for efficient inference
- 728K downloads
Limitations
- Needs 16GB RAM
- Newer model with less community support
- MoE complexity
Best use cases
- Multimodal chat
- Image understanding
- General AI tasks
- Content creation
Benchmarks
Speed: 6/10
Quality: 8/10
Coding: 8/10
Reasoning: 8/10
Technical details
Developer: Meta AI
License: Llama 4 Community License
Context window: 131,072 tokens
Architecture: Mixture of Experts (MoE) with native vision
Released: 2025-04
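The 131,072-token context window carries a memory cost on top of the weights, set by the KV cache. A minimal sketch of the standard KV-cache formula; the layer count, KV-head count, and head dimension below are illustrative assumptions, not confirmed Scout internals:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_len: int, bytes_per_elem: int = 2) -> int:
    """KV cache = 2 (K and V) * layers * kv_heads * head_dim * tokens * element size."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Illustrative values (assumed, not confirmed architecture details), fp16 cache:
gb = kv_cache_bytes(48, 8, 128, 131_072) / 1e9
print(f"{gb:.1f} GB")  # ≈ 25.8 GB for the full window
```

Local runtimes typically mitigate this by defaulting to much shorter contexts or quantizing the KV cache.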