Open-weight local LLM
DANTE-Mosaic-3.5B
OdaxAI compact dense model based on SmolLM3-3B and distilled from Kimi K2. Strong small-model benchmark profile: GSM8K 74.45, HellaSwag 76.73 and MBPP 42.6. Apache 2.0, BF16 weights, practical for local Transformers/vLLM use.
Laptop ready
8 GB RAM
BF16
Local chat on 8GB+ machines
Parameters
3.08B
Minimum RAM
8 GB
Model size
6.2 GB
Quantization
BF16
Can DANTE-Mosaic-3.5B run locally?
DANTE-Mosaic-3.5B is a good fit for normal laptops and compact desktops with 8 GB RAM or more.
Search for OdaxAI/DANTE-Mosaic-3.5B in LM Studio or another GGUF-compatible runtime.
OdaxAI/DANTE-Mosaic-3.5Bchatreasoningcodelightmultilingual
Install path
01
Check RAM fitMinimum 8 GB RAM. Start with the BF16 quant.02
Load the modelSearch OdaxAI/DANTE-Mosaic-3.5B in LM Studio.03
Control locallyUse LocalClaw to manage models, agents, chat, channels and scheduled OpenClaw work.Strengths
- Compact 3.08B dense model that can run on modest local hardware
- Apache 2.0 license with open weights, scripts, configs and evaluation assets
- Distilled from Kimi K2 while retaining a practical small-model footprint
- Strong reported small-model results: 74.45 GSM8K, 76.73 HellaSwag and 42.6 MBPP
- Runs from standard Hugging Face Transformers and can be served locally with vLLM/SGLang-style stacks
- Good candidate for laptop-friendly reasoning and coding experiments
Limitations
- No official GGUF quantization in the main repository at listing time
- BF16 weights are larger than a 3B Q4 GGUF would be
- Not a frontier model; quality is bounded by small dense-model capacity
- Context window is not clearly documented in the model card
Best use cases
- Local chat on 8GB+ machines
- Small-model reasoning experiments
- Light coding help and MBPP-style programming tasks
- Research on knowledge distillation from large MoE teachers
- Multilingual assistants with a small memory footprint