Local LLM model page

Llama 3.2 Vision (11B)

Meta's vision-enabled Llama. Image reasoning + text generation. 2.6M downloads.

Parameters
11B
Minimum RAM
12 GB
Model size
6.5 GB
Quantization
Q4_K_M

Can Llama 3.2 Vision (11B) run locally?

Llama 3.2 Vision (11B) is best suited to mainstream Macs and PCs with 16 GB of RAM. LocalClaw recommends the Q4_K_M quantization as the default, with at least 12 GB of RAM.
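How the 12 GB figure relates to the 6.5 GB model size can be sketched with a back-of-envelope calculation. This assumes Q4_K_M averages roughly 4.5 bits per weight (a common rule of thumb, not a value from the model card); the gap up to 12 GB covers the KV cache, vision encoder activations, and runtime overhead.

```python
# Rough memory estimate for a quantized GGUF model.
# Assumption: Q4_K_M averages ~4.5 bits per weight (rule of thumb).

def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate in-RAM size of the quantized weights, in GB."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

weights = model_size_gb(11, 4.5)
print(f"Quantized weights: ~{weights:.1f} GB")  # ~6.2 GB, close to the listed 6.5 GB
```

The remaining headroom up to the 12 GB minimum is what the runtime, KV cache, and image-processing buffers consume on top of the weights.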

Search term for LM Studio or compatible runtimes: llama-3.2-11b-vision-instruct

Hugging Face repository: lmstudio-community/Llama-3.2-11B-Vision-Instruct-GGUF

Tags: vision, standard

Strengths

  • Native vision understanding
  • 128K context
  • 2.6M downloads
  • Image reasoning + text generation

Limitations

  • Needs 12 GB of RAM
  • Vision quality is basic compared with specialized vision models
  • Llama license restrictions

Best use cases

  • Image description
  • Visual Q&A
  • Document understanding
  • Multimodal chat
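A visual Q&A request to a local runtime can be sketched as follows. This builds a message in the OpenAI-compatible chat format that LM Studio's local server accepts (by default at http://localhost:1234/v1); the placeholder image bytes and the question text are illustrative, and the payload is only constructed here, not sent.

```python
import base64
import json

def build_vision_request(model: str, question: str, image_bytes: bytes) -> dict:
    """Build an OpenAI-style chat request with an inline base64 image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

payload = build_vision_request(
    "llama-3.2-11b-vision-instruct",   # the LM Studio search term above
    "What is shown in this image?",
    b"\x89PNG placeholder",            # stand-in bytes; load a real file in practice
)
print(json.dumps(payload)[:80])
```

POSTing this payload to the runtime's `/v1/chat/completions` endpoint returns the model's answer in the standard chat-completion response shape.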

Benchmarks

Speed: 6/10

Quality: 7/10

Coding: 5/10

Reasoning: 7/10

Technical details

Developer: Meta AI

License: Llama 3.2 Community License

Context window: 131,072 tokens

Architecture: Multimodal Transformer (vision + language)

Released: 2024-09
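The 131,072-token context window listed above is memory-hungry in practice. A hedged sketch of KV-cache scaling follows; the layer and head counts are illustrative Llama-8B-class values, not this model's confirmed configuration, and the point is the scaling, not the exact figure.

```python
# Back-of-envelope KV-cache memory for a given context length.
# Illustrative hyperparameters (NOT the confirmed config of this model):
# 32 layers, 8 KV heads, head dim 128, fp16 cache (2 bytes per element).

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV-cache size in GB; the factor 2 covers keys + values."""
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_elem / 1e9

print(f"8K context:   ~{kv_cache_gb(32, 8, 128, 8_192):.1f} GB")
print(f"128K context: ~{kv_cache_gb(32, 8, 128, 131_072):.1f} GB")
```

Under these assumptions a full 128K-token cache costs far more memory than the quantized weights themselves, which is why runtimes typically default to a much shorter context on 16 GB machines.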