Local LLM model page

Mistral Nemo (12B)

A 128K-context model co-developed by Mistral AI and NVIDIA. Excellent for long documents and multi-turn conversations. 2.7M downloads.

Parameters
12B
Minimum RAM
12 GB
Model size
7.1 GB
Quantization
Q5_K_M

Can Mistral Nemo (12B) run locally?

Mistral Nemo (12B) is best suited for mainstream Macs and PCs with 16 GB RAM. LocalClaw recommends Q5_K_M as the default quantization, with at least 12 GB RAM.
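The 12 GB figure comes from adding the model file itself to the context (KV) cache and some runtime overhead. A minimal sketch of that arithmetic, assuming a ~7.1 GB Q5_K_M file and Mistral Nemo's commonly reported architecture (40 layers, 8 KV heads, 128-dim heads; treat these numbers as assumptions):

```python
def kv_cache_bytes(context_tokens: int,
                   n_layers: int = 40,
                   n_kv_heads: int = 8,
                   head_dim: int = 128,
                   bytes_per_value: int = 2) -> int:
    """fp16 KV cache size: keys + values, for every layer and token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * context_tokens

def estimated_ram_gb(model_file_gb: float, context_tokens: int,
                     runtime_overhead_gb: float = 1.0) -> float:
    """Model weights + KV cache + a flat allowance for the runtime itself."""
    return model_file_gb + kv_cache_bytes(context_tokens) / 1024**3 + runtime_overhead_gb

# An 8K context keeps the total under the 12 GB minimum;
# filling the full 128K window would need far more memory.
print(round(estimated_ram_gb(7.1, 8192), 1))
print(round(estimated_ram_gb(7.1, 131072), 1))
```

This is why the 12 GB minimum is workable for everyday use even though the headline feature is a 128K window: the cache for a full-length context would by itself exceed the model's weights.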

Search term for LM Studio or compatible runtimes: mistral-nemo-instruct

Hugging Face repository: lmstudio-community/Mistral-Nemo-Instruct-2407-GGUF

Tags: chat, general, standard

Strengths

  • 128K context
  • Co-developed with NVIDIA
  • 11 languages
  • Apache 2.0
  • Great reasoning

Limitations

  • Superseded by Mistral Small 3
  • Needs 12 GB RAM

Best use cases

  • Multilingual applications
  • Long document processing
  • RAG
  • Coding

Benchmarks

Speed: 7/10

Quality: 8/10

Coding: 7/10

Reasoning: 7/10

Technical details

Developer: Mistral AI × NVIDIA

License: Apache 2.0

Context window: 131,072 tokens

Architecture: Transformer with 128K context

Released: 2024-07