Local LLM model page
Mixtral (8x7B)
Mistral AI's pioneering Mixture-of-Experts (MoE) model. 46.7B total parameters, with fast inference thanks to sparse activation. Multilingual. 1.4M downloads.
Parameters
8x7B (46.7B)
Minimum RAM
32 GB
Model size
26 GB
Quantization
Q4_K_M
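The 26 GB figure lines up with the parameter count at roughly 4.5 bits per weight. A back-of-envelope check (the bits-per-weight and overhead values below are assumptions for illustration, not official numbers):

```python
# Rough RAM estimate for Mixtral 8x7B at Q4_K_M.
# Assumed values: ~4.5 bits/weight for Q4_K_M and ~4 GB of KV-cache/runtime headroom.
total_params = 46.7e9
bits_per_weight = 4.5
weights_gb = total_params * bits_per_weight / 8 / 1e9   # ≈ 26 GB of weights
overhead_gb = 4                                          # context cache + runtime (assumed)
print(f"{weights_gb:.1f} GB weights, ~{weights_gb + overhead_gb:.0f} GB in use")
# → a 32 GB class machine
```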
Can Mixtral (8x7B) run locally?
Mixtral (8x7B) is best suited for power-user machines with at least 32 GB of RAM. LocalClaw recommends Q4_K_M as the default quantization.
Search term for LM Studio or compatible runtimes: mixtral-8x7b-instruct
Hugging Face repository: TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF
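A minimal way to try this from a script rather than LM Studio, assuming huggingface_hub and llama-cpp-python are installed; the exact GGUF filename inside the repository is an assumption and may differ:

```python
# Sketch: fetch the Q4_K_M GGUF from Hugging Face and run it with llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF",
    filename="mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",  # assumed filename, ~26 GB download
)

llm = Llama(model_path=model_path, n_ctx=8192, n_gpu_layers=-1)  # smaller n_ctx keeps RAM use down
out = llm("[INST] Summarize this model card in two sentences. [/INST]", max_tokens=128)
print(out["choices"][0]["text"])
```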
Tags: chat, general, power, quality
Strengths
- MoE pioneer
- Fast despite 46.7B total params
- Apache 2.0
- 1.4M downloads
- Multilingual
Limitations
- Needs 32 GB RAM
- 32K context limit
- Superseded by newer MoE models
Best use cases
- General chat
- Multilingual tasks
- Enterprise
- RAG applications
Benchmarks
Speed: 4/10
Quality: 8/10
Coding: 8/10
Reasoning: 8/10
Technical details
Developer: Mistral AI
License: Apache 2.0
Context window: 32,768 tokens
Architecture: Sparse Mixture of Experts (8 experts, 2 active per token; see the sketch below)
Released: 2023-12
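For readers curious what "8 experts, 2 active per token" means in practice, here is an illustrative top-2 routing sketch. This is not Mistral's code; the layer sizes and names are made up, but it shows why only a fraction of the 46.7B parameters run for any given token:

```python
# Illustrative top-2 sparse MoE routing: a gating network scores the 8 experts
# per token and only the 2 highest-scoring experts actually run.
import torch
import torch.nn.functional as F

def moe_layer(x, gate, experts, top_k=2):
    """x: (tokens, d_model); gate: Linear(d_model, n_experts); experts: list of FFNs."""
    scores = gate(x)                                   # (tokens, 8) router logits
    weights, idx = scores.topk(top_k, dim=-1)          # pick the top 2 experts per token
    weights = F.softmax(weights, dim=-1)               # normalize over the chosen 2
    out = torch.zeros_like(x)
    for k in range(top_k):
        for e in range(len(experts)):
            mask = idx[:, k] == e                      # tokens routed to expert e in slot k
            if mask.any():
                out[mask] += weights[mask, k:k+1] * experts[e](x[mask])
    return out

# Tiny usage example with made-up sizes (d_model=32, 8 experts):
d, n_experts = 32, 8
gate = torch.nn.Linear(d, n_experts)
experts = [torch.nn.Sequential(torch.nn.Linear(d, 4 * d), torch.nn.GELU(), torch.nn.Linear(4 * d, d))
           for _ in range(n_experts)]
y = moe_layer(torch.randn(10, d), gate, experts)       # (10, 32)
```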