
Mixtral (8x7B)

Mistral AI's pioneering sparse mixture-of-experts model: 46.7B parameters in total, but only about 13B active per token, which keeps inference fast for its size. Multilingual. 1.4M downloads.

Parameters: 8x7B (46.7B total)
Minimum RAM: 32 GB
Model size: 26 GB
Quantization: Q4_K_M

Can Mixtral (8x7B) run locally?

Mixtral (8x7B) is best suited to power-user machines. LocalClaw recommends the Q4_K_M quantization as the default, which needs at least 32 GB of RAM.
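A quick way to sanity-check those figures: Q4_K_M stores weights at roughly 4.5 bits each on average (an approximation, not an exact spec), so the download size follows from the parameter count. A minimal sketch:

```python
# Back-of-the-envelope sizing for Mixtral 8x7B at Q4_K_M.
# The ~4.5 bits-per-weight figure is an approximation for this mixed quantization.

TOTAL_PARAMS = 46.7e9      # all experts combined
BITS_PER_WEIGHT = 4.5      # rough average for Q4_K_M

file_size_gb = TOTAL_PARAMS * BITS_PER_WEIGHT / 8 / 1e9
print(f"Approximate GGUF size: {file_size_gb:.1f} GB")  # ~26 GB

# The gap up to 32 GB is headroom for the KV cache, runtime buffers and the OS.
```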

Search term for LM Studio or compatible runtimes: mixtral-8x7b-instruct

Hugging Face repository: TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF
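If you would rather script the download and a first prompt than use LM Studio, one option is huggingface_hub plus llama-cpp-python. The sketch below assumes the Q4_K_M filename used in that repository; check the repo's file list before running it.

```python
# Sketch: fetch the Q4_K_M GGUF and run it with llama-cpp-python.
# Requires: pip install huggingface_hub llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF",
    filename="mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",  # assumed filename; verify in the repo
)

llm = Llama(
    model_path=model_path,
    n_ctx=8192,       # raise toward 32768 if you have RAM to spare for the KV cache
    n_gpu_layers=-1,  # offload layers to the GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarise what a mixture-of-experts model is."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```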

Tags: chat, general, power, quality

Strengths

  • MoE pioneer
  • Fast despite 46.7B total params
  • Apache 2.0
  • 1.4M downloads
  • Multilingual

Limitations

  • Needs 32 GB RAM
  • 32K context limit
  • Superseded by newer MoE models

Best use cases

  • General chat
  • Multilingual tasks
  • Enterprise
  • RAG applications
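For the RAG use case, the main prompt-side job is packing retrieved passages into Mixtral's [INST] ... [/INST] instruct format while staying under the 32K-token context limit. A minimal sketch; the wording and passage markers are illustrative choices, not a prescribed template, and the hardcoded passages stand in for whatever your retriever returns:

```python
# Sketch: assembling a RAG-style prompt for Mixtral's instruct format.

def build_rag_prompt(question: str, passages: list[str]) -> str:
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "[INST] Answer the question using only the context below. "
        "Cite passage numbers.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question} [/INST]"
    )

passages = [
    "Mixtral 8x7B is a sparse mixture-of-experts model from Mistral AI.",
    "Only 2 of its 8 experts are active for each token.",
]
print(build_rag_prompt("How many experts does Mixtral activate per token?", passages))
```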

Benchmarks

Speed: 4/10

Quality: 8/10

Coding: 8/10

Reasoning: 8/10

Technical details

Developer: Mistral AI

License: Apache 2.0

Context window: 32,768 tokens

Architecture: Sparse Mixture of Experts — 8 experts, 2 active per token

Released: 2023-12
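To make the sparse-activation point concrete, here is a toy top-2 routing layer in the spirit of Mixtral's MoE blocks. The dimensions and random weights are placeholders for illustration; this is not Mixtral's actual implementation.

```python
# Sketch: top-2 expert routing of the kind Mixtral's MoE layers use
# (8 experts, 2 active per token). Weights here are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, N_EXPERTS, TOP_K = 64, 8, 2

gate_w = rng.standard_normal((HIDDEN, N_EXPERTS))                     # router weights
experts = [rng.standard_normal((HIDDEN, HIDDEN)) for _ in range(N_EXPERTS)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token's hidden state through its top-2 experts."""
    logits = x @ gate_w                                               # one score per expert
    top = np.argsort(logits)[-TOP_K:]                                 # indices of the 2 best experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()         # softmax over the selected 2
    # Only 2 of the 8 expert matrices are touched for this token, which is
    # why inference cost tracks the ~13B active parameters, not all 46.7B.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(HIDDEN)
print(moe_layer(token).shape)  # (64,)
```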