Ornith 1.0 Is Out: Which Version Can You Run Locally?

Short answer: Ornith 1.0 is worth listing because it is not just a datacenter monster. The 9B GGUF is the small-machine entry point, the 35B GGUF is the serious local coding target, and the 397B release is a server-grade reference model for teams with multi-GPU infrastructure.

Small local

35B

32GB+ target

397B

Server grade

MIT

License

What Ornith 1.0 actually is

Ornith 1.0 is a family of open-source models from DeepReinforce focused on agentic coding: terminal tasks, repository edits, software engineering benchmarks, tool calls and long multi-step workflows. The model card describes the family as self-improving and lists 9B-dense, 31B-dense, 35B-MoE and 397B-MoE releases trained or post-trained on top of Gemma 4 and Qwen 3.5 lineages.

That matters because most "new model" announcements hide the practical detail users need. A 397B model can look impressive in a benchmark table and still be irrelevant to a MacBook owner. Ornith is different because it ships multiple sizes, including GGUF packages for the 9B and 35B variants.

The three Ornith versions that matter

Version	Best local fit	What it is for	LocalClaw verdict
Ornith 1.0 9B GGUF	8GB to 16GB machines	Fast local testing, lightweight coding agents, OpenClaw experiments.	Most accessible Ornith. Start here if you want to see the family behavior without buying hardware.
Ornith 1.0 35B GGUF	32GB+ machines	Agentic coding on stronger desktops, Mac Studio class machines, RTX systems with enough memory headroom.	The most interesting practical release. The Q4_K_M file is listed around 21.2GB, so it belongs in the same local tier as serious 30B-class coding models.
Ornith 1.0 397B	Multi-GPU server	Frontier agentic coding research, vLLM/SGLang serving, benchmark comparison.	Useful for SEO, research and comparison. Not a normal desktop recommendation.

Why the 35B GGUF is the real headline

The 397B model gets attention because the number is huge. But for LocalClaw users, the 35B GGUF release is the more important signal. Hugging Face lists it as GGUF, MIT licensed, with local app support including llama.cpp and LM Studio links. Its hardware table shows Q4_K_M at about 21.2GB, Q5_K_M at about 24.7GB, Q6_K at about 28.5GB, Q8_0 at about 36.9GB and BF16 at about 69.4GB.

In practice, that means Q4_K_M is a 32GB+ target. A 24GB GPU may be tight depending on context length, KV cache, runtime overhead and whether the model spills to system RAM. A 48GB MacBook Pro, Mac Studio, high-memory Apple Silicon machine or serious NVIDIA desktop is a much cleaner fit.

How Ornith compares with the models people already care about

Ornith is not trying to be the best friendly general chat model. It is positioned as an agentic coding family. So the right comparison set is Qwen Coder, GLM-5.2, DeepSeek, Gemma 4 and the large Qwen 3.5/3.7 line.

Model	Where it beats Ornith	Where Ornith is interesting	Best choice if...
Qwen 3 Coder 30B	Mature Qwen ecosystem, strong tooling, native 256K context, broad app support.	Ornith 35B is directly optimized and benchmarked for agentic coding tasks.	You want the safer, more proven coding model today.
GLM-5.2	Massive frontier model with 1M context and strong long-horizon coding positioning.	Ornith 35B is vastly more practical; Ornith 397B competes in the frontier benchmark class.	You have huge memory and want maximum model ceiling.
DeepSeek V3.2 Exp	DeepSeek ecosystem, sparse-attention research direction, strong reasoning/coding reputation.	Ornith is more explicitly packaged around coding-agent behavior and tool-call serving.	You want a frontier research reference rather than a desktop GGUF.
Gemma 4 12B	Cleaner general local model, multimodal angle, easier 16GB fit.	Ornith is more specialized for coding agents and terminal workflows.	You want general assistant quality on normal local hardware.
Qwen 3.5 397B A17B	Broader Qwen flagship ecosystem and general-purpose frontier positioning.	Ornith 397B is a direct coding-agent challenger with MIT licensing and OpenAI-compatible serving notes.	You are evaluating frontier open MoEs, not normal desktop use.

Which Ornith should you install?

8GB to 16GB RAM: Ornith 9B, but test against Qwen and Gemma

The 9B GGUF is the only Ornith variant that belongs in small-machine conversations. It is worth trying if your workflow is coding-heavy and you want to test agentic behavior. For general chat or writing, compare it against Qwen 3 8B, Gemma 4 12B on 16GB machines, and smaller Phi-style models.

32GB to 48GB RAM: Ornith 35B is the one to watch

This is the strongest LocalClaw recommendation for the family. The 35B GGUF release has the right combination: not tiny, not datacenter-only, explicitly coding-agent oriented, and packaged for local runtimes. On 32GB, keep context conservative. On 48GB+, you have more room for the OS, runtime, KV cache and longer prompts.

128GB+ and multi-GPU servers: Ornith 397B is a benchmark target

The 397B model card documents vLLM and SGLang serving recipes. It explicitly describes an OpenAI-compatible server on a single 8x80GB GPU node with tensor parallelism. That is not consumer local AI. It is local-sovereign infrastructure.

Benchmark claims: useful, but do not over-trust them

DeepReinforce reports strong agentic-coding results for Ornith, including Terminal-Bench, SWE-bench, NL2Repo, ClawEval and SWE Atlas style evaluations. That is exactly the right benchmark neighborhood for this model family. But benchmark tables from model cards should be treated as starting evidence, not final truth.

The practical test is simple: run your real repository tasks. Ask it to inspect files, modify code, call tools, keep state across a multi-step debugging session and recover when the first attempt fails. If Ornith handles that better than Qwen Coder or GLM on your machine, it deserves a permanent slot.

LocalClaw verdict

Ornith is absolutely worth adding to a local AI catalogue, but only if the catalogue separates the versions. A single "Ornith 397B" entry makes the family look irrelevant for normal users. The better framing is:

Ornith 9B GGUF: local entry point for small machines and quick coding-agent tests.
Ornith 35B GGUF: the main practical release for serious local coding on 32GB+ hardware.
Ornith 397B: server-grade research and frontier comparison.

If you already use LocalClaw, the practical next step is to compare Ornith 35B GGUF against Qwen 3 Coder 30B, Qwen 3 32B and DeepSeek R1 Distill 32B. That is the comparison that will tell you whether Ornith is a real upgrade for your local coding workflow.