
Mixtral (8x7B)

Mistral AI's pioneering sparse mixture-of-experts model: 46.7B parameters in total, but only about 13B active per token, which keeps inference fast for its size. Multilingual. 1.4M downloads.

Parameters: 8x7B (46.7B total)
Minimum RAM: 32 GB
Model size: 26 GB
Quantization: Q4_K_M

Can Mixtral (8x7B) run locally?

Mixtral (8x7B) is best suited to power-user machines. LocalClaw recommends the Q4_K_M quantization as the default, which needs at least 32 GB of RAM.
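A quick way to sanity-check those figures: Q4_K_M stores weights at roughly 4.5 bits each on average (an approximation, not an exact spec), so the download size follows from the parameter count. A minimal sketch:

```python
# Back-of-the-envelope sizing for Mixtral 8x7B at Q4_K_M.
# The ~4.5 bits-per-weight figure is an approximation for this mixed quantization.

TOTAL_PARAMS = 46.7e9      # all experts combined
BITS_PER_WEIGHT = 4.5      # rough average for Q4_K_M

file_size_gb = TOTAL_PARAMS * BITS_PER_WEIGHT / 8 / 1e9
print(f"Approximate GGUF size: {file_size_gb:.1f} GB")  # ~26 GB

# The gap up to 32 GB is headroom for the KV cache, runtime buffers and the OS.
```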

Search term for LM Studio or compatible runtimes: mixtral-8x7b-instruct

Hugging Face repository: TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF
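If you would rather script the download and a first prompt than use LM Studio, one option is huggingface_hub plus llama-cpp-python. The sketch below assumes the Q4_K_M filename used in that repository; check the repo's file list before running it.

```python
# Sketch: fetch the Q4_K_M GGUF and run it with llama-cpp-python.
# Requires: pip install huggingface_hub llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF",
    filename="mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",  # assumed filename; verify in the repo
)

llm = Llama(
    model_path=model_path,
    n_ctx=8192,       # raise toward 32768 if you have RAM to spare for the KV cache
    n_gpu_layers=-1,  # offload layers to the GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarise what a mixture-of-experts model is."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```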

Tags: chat, general, power, quality

Strengths

  • MoE pioneer
  • Fast despite 46.7B total params
  • Apache 2.0
  • 1.4M downloads
  • Multilingual

Limitations

  • Needs 32 GB RAM
  • 32K context limit
  • Superseded by newer MoE models

Best use cases

  • General chat
  • Multilingual tasks
  • Enterprise
  • RAG applications
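For the RAG use case, the main prompt-side job is packing retrieved passages into Mixtral's [INST] ... [/INST] instruct format while staying under the 32K-token context limit. A minimal sketch; the wording and passage markers are illustrative choices, not a prescribed template, and the hardcoded passages stand in for whatever your retriever returns:

```python
# Sketch: assembling a RAG-style prompt for Mixtral's instruct format.

def build_rag_prompt(question: str, passages: list[str]) -> str:
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "[INST] Answer the question using only the context below. "
        "Cite passage numbers.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question} [/INST]"
    )

passages = [
    "Mixtral 8x7B is a sparse mixture-of-experts model from Mistral AI.",
    "Only 2 of its 8 experts are active for each token.",
]
print(build_rag_prompt("How many experts does Mixtral activate per token?", passages))
```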

Benchmarks

Speed: 4/10

Quality: 8/10

Coding: 8/10

Reasoning: 8/10

Technical details

Developer: Mistral AI

License: Apache 2.0

Context window: 32,768 tokens

Architecture: Sparse Mixture of Experts — 8 experts, 2 active per token

Released: 2023-12
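To make the sparse-activation point concrete, here is a toy top-2 routing layer in the spirit of Mixtral's MoE blocks. The dimensions and random weights are placeholders for illustration; this is not Mixtral's actual implementation.

```python
# Sketch: top-2 expert routing of the kind Mixtral's MoE layers use
# (8 experts, 2 active per token). Weights here are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, N_EXPERTS, TOP_K = 64, 8, 2

gate_w = rng.standard_normal((HIDDEN, N_EXPERTS))                     # router weights
experts = [rng.standard_normal((HIDDEN, HIDDEN)) for _ in range(N_EXPERTS)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token's hidden state through its top-2 experts."""
    logits = x @ gate_w                                               # one score per expert
    top = np.argsort(logits)[-TOP_K:]                                 # indices of the 2 best experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()         # softmax over the selected 2
    # Only 2 of the 8 expert matrices are touched for this token, which is
    # why inference cost tracks the ~13B active parameters, not all 46.7B.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(HIDDEN)
print(moe_layer(token).shape)  # (64,)
```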