Local LLM model page

Mistral Nemo (12B)

A 128K-context model co-developed by Mistral AI and NVIDIA. Excellent for long documents and multi-turn conversations. 2.7M downloads.

Parameters
12B
Minimum RAM
12 GB
Model size
7.1 GB
Quantization
Q5_K_M

Can Mistral Nemo (12B) run locally?

Mistral Nemo (12B) is best suited for mainstream Macs and PCs with 16 GB RAM. LocalClaw recommends Q5_K_M as the default quantization, with at least 12 GB RAM.
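The 12 GB figure comes from adding the model file itself to the context (KV) cache and some runtime overhead. A minimal sketch of that arithmetic, assuming a ~7.1 GB Q5_K_M file and Mistral Nemo's commonly reported architecture (40 layers, 8 KV heads, 128-dim heads; treat these numbers as assumptions):

```python
def kv_cache_bytes(context_tokens: int,
                   n_layers: int = 40,
                   n_kv_heads: int = 8,
                   head_dim: int = 128,
                   bytes_per_value: int = 2) -> int:
    """fp16 KV cache size: keys + values, for every layer and token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * context_tokens

def estimated_ram_gb(model_file_gb: float, context_tokens: int,
                     runtime_overhead_gb: float = 1.0) -> float:
    """Model weights + KV cache + a flat allowance for the runtime itself."""
    return model_file_gb + kv_cache_bytes(context_tokens) / 1024**3 + runtime_overhead_gb

# An 8K context keeps the total under the 12 GB minimum;
# filling the full 128K window would need far more memory.
print(round(estimated_ram_gb(7.1, 8192), 1))
print(round(estimated_ram_gb(7.1, 131072), 1))
```

This is why the 12 GB minimum is workable for everyday use even though the headline feature is a 128K window: the cache for a full-length context would by itself exceed the model's weights.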

Search term for LM Studio or compatible runtimes: mistral-nemo-instruct

Hugging Face repository: lmstudio-community/Mistral-Nemo-Instruct-2407-GGUF

Tags: chat, general, standard

Strengths

  • 128K context
  • Co-developed with NVIDIA
  • 11 languages
  • Apache 2.0
  • Great reasoning

Limitations

  • Superseded by Mistral Small 3
  • Needs 12 GB RAM

Best use cases

  • Multilingual applications
  • Long document processing
  • RAG
  • Coding

Benchmarks

Speed: 7/10

Quality: 8/10

Coding: 7/10

Reasoning: 7/10

Technical details

Developer: Mistral AI × NVIDIA

License: Apache 2.0

Context window: 131,072 tokens

Architecture: Transformer with 128K context

Released: 2024-07