Local LLM model page

Hermes 4 (70B)

Nous Research fine-tune with hybrid reasoning mode. Built on Llama 3.1 70B, aligned for steerability and neutral stance. Top-tier open RP/agent model. Llama 3.1 Community License.

Parameters
70B
Minimum RAM
48 GB
Model size
42 GB
Quantization
Q4_K_M

Can Hermes 4 (70B) run locally?

Hermes 4 (70B) is best suited for high-end workstations with 64 GB RAM. LocalClaw recommends Q4_K_M as the default quantization, with at least 48 GB RAM.
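The relationship between parameter count, quantization, and file size can be sanity-checked with simple arithmetic. The sketch below assumes Q4_K_M averages roughly 4.8 bits per weight (the exact figure varies by tensor mix); the headroom constant for KV cache and runtime overhead is likewise an assumption for illustration.

```python
def gguf_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough GGUF file size: parameter count times average bits per weight."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Q4_K_M averages ~4.8 bits per weight (assumption; varies per tensor).
model_gb = gguf_size_gb(70, 4.8)   # ≈ 42 GB, matching the listed model size

# Add headroom for KV cache and runtime overhead (illustrative ~6 GB)
# to land near the 48 GB minimum RAM recommendation.
min_ram_gb = model_gb + 6
```

Lower quantizations (Q3, Q2) shrink the file further at a quality cost; higher ones (Q5, Q6) push the requirement toward the 64 GB workstation tier.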

Search term for LM Studio or compatible runtimes: hermes-4-70b

Hugging Face repository: NousResearch/Hermes-4-Llama-3.1-70B-GGUF
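For runtimes outside LM Studio, the GGUF can be fetched and run with llama.cpp. A minimal sketch; the exact `.gguf` filename below is an assumption, so check the repository's file list before downloading:

```shell
# Download one quantization from the repo (filename is assumed; verify
# against the actual file list on Hugging Face).
huggingface-cli download NousResearch/Hermes-4-Llama-3.1-70B-GGUF \
  Hermes-4-Llama-3.1-70B.Q4_K_M.gguf --local-dir ./models

# Run an interactive prompt with llama.cpp's CLI.
llama-cli -m ./models/Hermes-4-Llama-3.1-70B.Q4_K_M.gguf \
  -p "Hello" -n 128
```

Large quantizations are sometimes split into multi-part GGUF files; llama.cpp loads them when pointed at the first part.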


Strengths

  • Hybrid reasoning mode: the model can answer directly or produce an explicit reasoning trace.
  • Aligned for steerability and a neutral stance.
  • Top-tier open model for roleplay and agent workloads.

Limitations

  • Performance depends heavily on quantization, RAM bandwidth and runtime support.

Best use cases

  • General-purpose and roleplay chat
  • Multi-step reasoning
  • Power users prioritizing output quality over speed

Benchmarks

Speed: 3/10

Quality: 9/10

Coding: 8/10

Reasoning: 9/10
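The low speed score follows from memory bandwidth: on CPU, each generated token must stream all 42 GB of weights from RAM, so bandwidth divided by model size gives an upper bound on decode speed. The 100 GB/s figure below is an illustrative assumption for a dual-channel DDR5 workstation, not a measured number.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed: each token reads every weight once."""
    return bandwidth_gb_s / model_size_gb

# Hypothetical 100 GB/s system RAM bandwidth (assumption) vs. the
# listed 42 GB Q4_K_M file.
rate = max_tokens_per_second(100, 42)   # ≈ 2.4 tokens/s ceiling
```

Real throughput lands below this ceiling; GPU offload of some or all layers is the usual way to raise it.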

Technical details

Developer: Nous Research

License: Llama 3.1 Community License

Context window: Not listed here; the Llama 3.1 base model supports up to 128K tokens

Architecture: Llama 3.1 (dense transformer); see model card for details

Released: 2025-09