Local LLM model page

Llama 4 Scout (17B/16E MoE)

Meta's multimodal MoE model: 17B active parameters routed across 16 experts (~109B total). Built-in image understanding. 10M-token native context window. Llama 4 Community License. 728K downloads.

Parameters
17B active (109B total, 16 experts)
Minimum RAM
16 GB
Model size
10 GB
Quantization
Q4_K_M

Can Llama 4 Scout (17B/16E MoE) run locally?

Llama 4 Scout (17B/16E MoE) is best suited to mainstream Macs and PCs with at least 16 GB of RAM. LocalClaw recommends Q4_K_M as the default quantization.
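As a rough sanity check on the figures above, a quantized GGUF's size is roughly parameter count × bits per weight ÷ 8. A minimal sketch (the ~4.8 bits/weight for Q4_K_M is an approximation, and note that an MoE checkpoint stores all experts on disk, so a file containing every expert scales with total rather than active parameters):

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough quantized-file size estimate: params * bits / 8, in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

# Q4_K_M averages roughly 4.8 bits/weight (mixed 4- and 6-bit blocks).
active_gb = gguf_size_gb(17e9, 4.8)   # weights for the ~17B active params
print(f"~{active_gb:.1f} GB")         # ~10 GB, in line with the listed model size
```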

Search term for LM Studio or compatible runtimes: llama-4-scout

Hugging Face repository: meta-llama/Llama-4-Scout-17B-16E-Instruct-GGUF

Tags: chat, vision, power, general

Strengths

  • Latest Meta multimodal model
  • Built-in vision
  • MoE for efficient inference
  • 728K downloads

Limitations

  • Needs 16 GB RAM
  • New model — less community support
  • MoE complexity

Best use cases

  • Multimodal chat
  • Image understanding
  • General AI tasks
  • Content creation

Benchmarks

Speed: 6/10

Quality: 8/10

Coding: 8/10

Reasoning: 8/10

Technical details

Developer: Meta AI

License: Llama 4 Community License

Context window: 10M tokens native; local runtimes commonly expose 131,072

Architecture: Mixture of Experts (MoE) with native vision
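The efficiency claim above comes from sparse routing: a router scores all experts per token, but only the top-scoring expert's feed-forward block is evaluated, so compute tracks active rather than total parameters. A toy sketch of top-1 routing (dimensions are illustrative, and Scout's additional shared expert is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 8       # toy hidden size, not the real model dimension
N_EXPERTS = 16   # Scout routes each token to 1 of 16 experts

# Router: a linear layer producing one score per expert per token.
router_w = rng.normal(size=(HIDDEN, N_EXPERTS))
# Each expert: a toy feed-forward weight matrix.
experts = rng.normal(size=(N_EXPERTS, HIDDEN, HIDDEN))

def moe_layer(tokens: np.ndarray) -> np.ndarray:
    """Top-1 MoE routing: only one expert's FFN runs per token."""
    logits = tokens @ router_w                           # (T, N_EXPERTS)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)           # softmax over experts
    choice = probs.argmax(axis=-1)                       # chosen expert per token
    out = np.empty_like(tokens)
    for t, e in enumerate(choice):
        # Only expert `e` runs for token `t`; the other 15 stay idle,
        # which is why active compute (~17B) is far below total params (~109B).
        out[t] = probs[t, e] * (tokens[t] @ experts[e])
    return out

tokens = rng.normal(size=(4, HIDDEN))
print(moe_layer(tokens).shape)
```

All experts must still be resident in memory for routing to pick among them, which is why MoE models save compute but not weight storage.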

Released: 2025-04