Local LLM model page
Mistral Nemo (12B)
A 128K-context model co-developed by Mistral AI and NVIDIA. Excellent for long documents and conversations. 2.7M downloads.
Parameters
12B
Minimum RAM
12 GB
Model size
7.1 GB
Quantization
Q5_K_M
Can Mistral Nemo (12B) run locally?
Yes. Mistral Nemo (12B) runs well on mainstream Macs and PCs with 16 GB of RAM. LocalClaw recommends Q5_K_M as the default quantization, with at least 12 GB of RAM available.
Search term for LM Studio or compatible runtimes: mistral-nemo-instruct
Hugging Face repository: lmstudio-community/Mistral-Nemo-Instruct-2407-GGUF
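A rough way to sanity-check the 12 GB minimum is to add the model file size from this page (7.1 GB) to the KV cache and a runtime allowance. The sketch below uses assumed architecture values (40 layers, 8 grouped-query KV heads, head dimension 128, from Mistral Nemo's published config); treat the result as an estimate, not a guarantee.

```python
# Rough RAM estimate for running the GGUF locally.
# The 7.1 GB file size comes from this page; the KV-cache
# parameters are assumptions (Mistral Nemo's published config:
# 40 layers, 8 KV heads, head dim 128), not verified here.

def kv_cache_gb(n_ctx, n_layers=40, n_kv_heads=8, head_dim=128, bytes_per_val=2):
    """FP16 key+value cache size in GB for a given context length."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_val / 1024**3

def total_ram_gb(model_file_gb, n_ctx, overhead_gb=1.0):
    """Model weights + KV cache + a rough runtime/OS overhead."""
    return model_file_gb + kv_cache_gb(n_ctx) + overhead_gb

print(round(total_ram_gb(7.1, 8192), 2))  # ~9.35 GB at an 8K context
```

At a modest 8K context this lands around 9.4 GB, which is consistent with a 12 GB minimum once desktop apps and OS usage are factored in.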
Tags: chat, general, standard
Strengths
- 128K context
- Co-developed with NVIDIA
- 11 languages
- Apache 2.0
- Great reasoning
Limitations
- Superseded by Mistral Small 3
- Needs 12 GB of RAM
Best use cases
- Multilingual applications
- Long document processing
- RAG
- Coding
Benchmarks
Speed: 7/10
Quality: 8/10
Coding: 7/10
Reasoning: 7/10
Technical details
Developer: Mistral AI × NVIDIA
License: Apache 2.0
Context window: 131,072 tokens
Architecture: Transformer with 128K context
Released: 2024-07
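The 131,072-token window is the headline feature, but the KV cache grows linearly with context, so filling it is expensive. A minimal sketch, assuming the same architecture values as above (40 layers, 8 grouped-query KV heads, head dimension 128, FP16 cache):

```python
# KV-cache growth with context length (linear in n_ctx).
# Architecture numbers are assumptions based on Mistral Nemo's
# published config (40 layers, 8 GQA KV heads, head dim 128).

def kv_cache_gb(n_ctx, n_layers=40, n_kv_heads=8, head_dim=128, bytes_per_val=2):
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_val / 1024**3

for ctx in (4096, 32768, 131072):
    print(f"{ctx:>7} tokens -> {kv_cache_gb(ctx):5.1f} GB KV cache")
```

Under these assumptions the full 128K context needs roughly 20 GB for the cache alone, which is why local runtimes typically cap the context length or quantize the KV cache rather than defaulting to the maximum.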