Local LLM model page
Hermes 4 (70B)
Nous Research fine-tune with hybrid reasoning mode. Built on Llama 3.1 70B, aligned for steerability and neutral stance. Top-tier open RP/agent model. Llama 3.1 Community License.
Parameters
70B
Minimum RAM
48 GB
Model size
42 GB
Quantization
Q4_K_M
Can Hermes 4 (70B) run locally?
Hermes 4 (70B) is best suited for high-end workstations with 64 GB RAM. LocalClaw recommends Q4_K_M as the default quantization, with at least 48 GB RAM.
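The 42 GB file size and 48 GB RAM floor follow from a simple back-of-envelope calculation. A minimal sketch, assuming roughly 4.8 bits per weight for Q4_K_M (an approximation, since K-quants mix bit widths across tensors):

```python
def estimate_gguf_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough on-disk size of a quantized model in decimal gigabytes."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB

# Hermes 4 (70B) at Q4_K_M (~4.8 bits/weight, approximate)
size = estimate_gguf_gb(70, 4.8)
print(f"{size:.0f} GB")  # matches the 42 GB listed above
```

The 48 GB RAM recommendation adds headroom on top of the file size for the KV cache and runtime overhead, which grow with context length.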
Search term for LM Studio or compatible runtimes: hermes-4-70b
Hugging Face repository: NousResearch/Hermes-4-Llama-3.1-70B-GGUF
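Outside LM Studio, the GGUF can be fetched and run with llama.cpp directly. A sketch, assuming `huggingface-cli` and llama.cpp are installed; the exact `.gguf` filename inside the repo is an assumption, so list the repo files first if the pattern does not match:

```shell
# Download only the Q4_K_M shard(s) from the repository listed above.
huggingface-cli download NousResearch/Hermes-4-Llama-3.1-70B-GGUF \
  --include "*Q4_K_M*.gguf" --local-dir ./hermes-4-70b

# Run with llama.cpp; substitute the actual downloaded filename.
# -ngl 99 offloads all layers to the GPU if VRAM allows; lower it otherwise.
llama-cli -m ./hermes-4-70b/<downloaded-file>.gguf \
  -c 8192 -ngl 99 -p "Hello"
```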
Strengths
- Hybrid reasoning mode: can respond directly or with explicit reasoning
- Aligned for steerability and a neutral stance
- Top-tier open model for roleplay and agent workloads
- Built on the proven Llama 3.1 70B base
Limitations
- Performance depends heavily on quantization level, memory bandwidth, and runtime support.
Best use cases
- chat
- reasoning
- power
- quality
- general
Benchmarks
Speed: 3/10
Quality: 9/10
Coding: 8/10
Reasoning: 9/10
Technical details
Developer: Nous Research
License: Llama 3.1 Community License
Context window: Unknown
Architecture: Llama 3.1 (dense transformer)
Released: 2025-09