Local TTS model
Supertonic 3
Lightning-fast on-device TTS that runs on ONNX Runtime with no cloud calls. About 99M parameters, 31 languages, improved reading stability, and expression tags such as laugh, breath and sigh.
Edge ready
text-to-speech generation
31 languages
OpenRAIL-M
Quality
8.8/10
Speed
9.8/10
Model size
0.2 GB
Voices
Preset voice styles + downloadable custom style embeddings
Can Supertonic 3 run locally?
Supertonic 3 can generate speech locally for private voice workflows. Start with pip install supertonic.
OpenRAIL-M license. Review upstream restrictions before commercial use.
pip install supertonic
Upstream source
realtime, low-latency, multilingual, emotion
Audio profile
Best fit
Supertonic 3 is best for fast on-device voice responses and local assistants.
Hardware: CPU, GPU, Apple, edge
Model details
Type
Local TTS model
Family
supertonic
Latency
ultra-low
Formats
onnx
Languages
en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi
Context
Open-weight ONNX model, 31 languages, practical local inference
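The Languages entry above can be turned into a small guard that rejects unsupported text before it reaches the model. This is a minimal sketch: `SUPPORTED_LANGUAGES` and `is_supported` are hypothetical names and not part of any Supertonic API, and how the model treats regional variants such as `pt-BR` is not stated on this page, so the helper only compares the primary subtag.

```python
# Language codes copied from the Languages entry on this page.
SUPPORTED_LANGUAGES = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi".split()
)


def is_supported(language_tag: str) -> bool:
    """Return True if the primary subtag of a tag like 'pt-BR' is listed.

    Assumption: only the primary subtag matters; regional handling by the
    model itself is not documented here.
    """
    primary = language_tag.replace("_", "-").split("-")[0].strip().lower()
    return primary in SUPPORTED_LANGUAGES
```

Calling `is_supported("pt-BR")` checks `pt` against the list, so a regional tag passes whenever its base language is listed.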
Install locally
01
Check runtime
Confirm the backend supports onnx on your machine.
02
Install model
Use the upstream command or repository instructions.
03
Test locally
Run a short private audio prompt before moving into production workflows.
pip install supertonic
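The runtime check in step 01 can be scripted. The sketch below assumes the backend is the `onnxruntime` Python package; it reports whether that package is importable and which execution providers it exposes. `onnx_runtime_report` is a hypothetical helper name, and the script does not load or verify the Supertonic model itself.

```python
import importlib.util


def onnx_runtime_report() -> dict:
    """Report whether onnxruntime is importable and which providers it offers."""
    # find_spec avoids an ImportError when the package is missing.
    if importlib.util.find_spec("onnxruntime") is None:
        return {"installed": False, "providers": []}

    import onnxruntime as ort

    return {
        "installed": True,
        "version": ort.__version__,
        # e.g. CPUExecutionProvider; the exact list depends on the build.
        "providers": ort.get_available_providers(),
    }


if __name__ == "__main__":
    print(onnx_runtime_report())
```

An empty provider list with `installed` set to `False` means ONNX Runtime still needs to be installed before the model can run.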
Good for
- text-to-speech generation
- Edge-ready local workflows
- realtime, low-latency, multilingual
Watch before shipping
- Validate pronunciation, latency and artifacts with your own voice samples.
- Review the upstream license and acceptable-use notes.
- Benchmark on your target CPU, Apple Silicon or GPU setup.
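The benchmarking advice above can be made concrete with a small timing harness. This is a sketch under stated assumptions: `synthesize` is a hypothetical wrapper you write around your own TTS call and must return the duration of the generated audio in seconds, and `fake_synthesize` is a stand-in used only to show the harness running. Nothing here calls the Supertonic API.

```python
import statistics
import time
from typing import Callable


def benchmark(synthesize: Callable[[str], float], text: str, runs: int = 5) -> dict:
    """Time `synthesize` and report latency and real-time factor (RTF).

    `synthesize` is a hypothetical wrapper around your TTS call; it must
    return the length of the generated audio in seconds.
    """
    latencies, rtfs = [], []
    synthesize(text)  # warm-up run, excluded from the statistics
    for _ in range(runs):
        start = time.perf_counter()
        audio_seconds = synthesize(text)
        elapsed = time.perf_counter() - start
        latencies.append(elapsed)
        rtfs.append(elapsed / audio_seconds)  # RTF = compute time / audio time
    return {
        "median_latency_s": statistics.median(latencies),
        "max_latency_s": max(latencies),
        "median_rtf": statistics.median(rtfs),
    }


def fake_synthesize(text: str) -> float:
    """Stand-in that sleeps briefly and pretends to produce 1 s of audio."""
    time.sleep(0.01)
    return 1.0
```

Run `benchmark(fake_synthesize, "hello")` to see the output shape, then swap in a wrapper around your real synthesis call. A median RTF below 1 means audio is generated faster than it plays back; repeat the run on each target CPU, Apple Silicon or GPU setup.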
Related TTS and speech models
hexgrad
Kokoro TTS
Local TTS model · Q 9.2 · Speed 9.8
Alibaba Cloud (Qwen Team)
Qwen3 TTS
Local TTS model · Q 9.5 · Speed 8.5
Kyutai
Moshi
Local TTS model · Q 9 · Speed 9.5
Alibaba FunAudioLLM
CosyVoice 2
Local TTS model · Q 9.3 · Speed 8.8
OpenBMB
VoxCPM2
Local TTS model · Q 9.4 · Speed 8.3
OpenMOSS / MOSI.AI
MOSS-TTS-Nano
Local TTS model · Q 8.5 · Speed 9.7
Speech Research (SWivid)
F5-TTS v1.1
Local TTS model · Q 9.5 · Speed 9.2
Zyphra
Zonos v0.1
Local TTS model · Q 9.5 · Speed 8.5