Open-weight local LLM
LFM2.5-8B-A1B
Liquid AI hybrid model built for on-device assistants. 8.3B total / 1.5B active, 128K context, tool use, GGUF, ONNX, MLX, llama.cpp and LM Studio support. Open-weight under LFM 1.0.
Laptop ready
8 GB RAM
Q4_K_M
On-device personal assistant
Parameters
8.3B (1.5B active)
Minimum RAM
8 GB
Model size
5.2 GB
Quantization
Q4_K_M
Can LFM2.5-8B-A1B run locally?
LFM2.5-8B-A1B is a good fit for normal laptops and compact desktops with 8 GB RAM or more.
Search for lfm2.5-8b-a1b in LM Studio or another GGUF-compatible runtime.
LiquidAI/LFM2.5-8B-A1B-GGUFchatcodereasoningspeedstandardgeneral
Install path
01
Check RAM fitMinimum 8 GB RAM. Start with the Q4_K_M quant.02
Load the modelSearch lfm2.5-8b-a1b in LM Studio.03
Control locallyUse LocalClaw to manage models, agents, chat, channels and scheduled OpenClaw work.Strengths
- Designed specifically for on-device personal assistants and local agent workflows
- Only 1.5B active parameters at inference despite 8.3B total parameters
- 128K context window for long local sessions and document-heavy prompts
- Day-one GGUF, ONNX, MLX, llama.cpp and LM Studio support
- Strong fit for structured outputs, tool use and lightweight agentic tasks
- Runs on mainstream 8-16 GB machines with quantized weights
Limitations
- LFM 1.0 is a custom open-weight license, not Apache 2.0
- Liquid AI notes it is not the best fit for heavy programming or knowledge-heavy QA without retrieval
- Hybrid architecture may need recent runtimes for best performance
- Still a small active-parameter model; larger 20B-30B class models can beat it on raw quality
Best use cases
- On-device personal assistant
- Local OpenClaw agents with tool calls
- Structured output workflows
- Fast multilingual chat on laptops
- Long-context local note and document workflows
- Apple Silicon inference through MLX