Local LLM model page
Llama 3.3 (70B)
Meta's 70B workhorse with a strong finetuning ecosystem, though GLM 4.5 Air and DeepSeek V3.2 outperform it on raw quality.
Parameters
70B
Minimum RAM
48 GB
Model size
42 GB
Quantization
Q4_K_M
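The figures above are consistent: at roughly 4.8 effective bits per weight for Q4_K_M (an approximation, not an exact spec), 70B parameters come out to about 42 GB on disk. A quick sanity check:

```python
# Rough on-disk size for a quantized model: params * bits-per-weight / 8 bytes.
# 4.8 effective bits/weight for Q4_K_M is an approximation (K-quants mix
# block sizes), so treat the result as an estimate.
def quantized_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Estimated GGUF file size in GB for a given quantization."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

print(round(quantized_size_gb(70, 4.8)))  # ~42 GB, matching the model size above
```

The minimum-RAM figure is larger than the file size because the KV cache and runtime overhead sit on top of the weights.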
Can Llama 3.3 (70B) run locally?
Llama 3.3 (70B) is best suited to high-end workstations with 64 GB of RAM. LocalClaw recommends Q4_K_M as the default quantization, with at least 48 GB of RAM.
Search term for LM Studio or compatible runtimes: llama-3.3-70b-instruct
Hugging Face repository: lmstudio-community/Llama-3.3-70B-Instruct-GGUF
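One way to fetch and serve the model is with huggingface-cli and llama.cpp's llama-server. The --include pattern and the exact GGUF filename below are assumptions; a 42 GB model is often split into multiple part files, so check the repository's file list before running:

```shell
# Download only the Q4_K_M files from the repo listed above
# (filenames are an assumption -- verify against the repo's file list).
huggingface-cli download lmstudio-community/Llama-3.3-70B-Instruct-GGUF \
  --include "*Q4_K_M*" --local-dir models

# Serve the model locally with llama.cpp; adjust the path and context size
# to match the downloaded file and your available RAM.
llama-server -m models/Llama-3.3-70B-Instruct-Q4_K_M.gguf -c 8192
```

LM Studio users can skip the CLI and search for the term above in the in-app model browser instead.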
Tags: chat, power, quality, general
Strengths
- Best Llama model to date
- Matches Llama 3.1 405B on some tasks
- Strong coding and reasoning
- 128K context
Limitations
- Requires 48 GB+ RAM
- Slow inference
Best use cases
- Maximum quality local AI
- Enterprise
- Research
- Complex analysis
Benchmarks
Speed: 2/10
Quality: 9/10
Coding: 8/10
Reasoning: 8/10
Technical details
Developer: Meta AI
License: Llama 3.3 Community License
Context window: 131,072 tokens
Architecture: Transformer with grouped-query attention (GQA)
Released: 2024-12