TTS model pages

Local TTS models

Static, indexable pages for open local text-to-speech, voice cloning, ASR and speech AI models.

TTS / speech models: 50
Featured: 12
Best quality: Higgs Audio v2
Fastest: Parakeet TDT 0.6B v2

Featured TTS models

Alibaba Cloud (Qwen Team)

Qwen3 TTS

Quality 9.5 · Speed 8.5 · 2.8GB · Apache 2.0

State-of-the-art multilingual TTS with natural prosody and emotion control. Supports 30+ languages with streaming inference.

streaming · realtime · multilingual · emotion

Rhasspy

Piper

Quality 7.5 · Speed 10 · 0.1GB · MIT

Fast, local neural TTS optimized for Raspberry Pi and edge devices. Lightweight with good quality for embedded systems.

realtime · low-latency

Speech Research

F5-TTS

Quality 9.4 · Speed 9 · 1.5GB · MIT

Flow-matching-based TTS with state-of-the-art quality and very fast inference. Simple, efficient architecture.

realtime · cloning · streaming

hexgrad

Kokoro TTS

Quality 9.2 · Speed 9.8 · 0.33GB · Apache 2.0

Ultra-lightweight with high quality for its size. Only 82M parameters; runs on CPU in real time. One of the best quality-to-size ratios among the models listed here.

realtime · streaming · low-latency · multilingual

Canopy Labs

Orpheus TTS

Quality 9.6 · Speed 7.5 · 3.5GB · Apache 2.0

LLM-based TTS with human-level naturalness. Supports rich emotion tags (laugh, sigh, hesitation). Built on a Llama 3 backbone for highly expressive speech.

emotion · streaming · cloning

Resemble AI

Chatterbox TTS

Quality 9.4 · Speed 8 · 1.2GB · MIT

Open-source state-of-the-art voice cloning from Resemble AI. Reported to outperform ElevenLabs on naturalness benchmarks. Supports emotion exaggeration control and ultra-stable generation.

cloning · emotion · streaming

Nari Labs

Dia

Quality 9.3 · Speed 7 · 3GB · Apache 2.0

1.6B-parameter dialogue TTS that generates realistic two-speaker conversations from a single transcript. Natively supports non-verbal cues such as [laughs], [coughs], and [sighs].

dialogue · emotion · cloning · streaming

OuteAI

OuteTTS

Quality 8.7 · Speed 8.5 · 0.9GB · MIT

Pure language-model approach to TTS with no separate audio encoder. Runs via llama.cpp for fully local GGUF inference. Well suited to CPU-only setups.

realtime · low-latency · cloning

Alibaba FunAudioLLM

CosyVoice 2

Quality 9.3 · Speed 8.8 · 2.4GB · Apache 2.0

Industrial-grade multilingual TTS with streaming, voice cloning and emotion control. Exceptional Chinese + English quality. Used in production at Alibaba scale.

streaming · realtime · cloning · emotion · multilingual

SparkAudio

Spark TTS

Quality 9 · Speed 8.2 · 3GB · Apache 2.0

Bilingual TTS with virtual speaker creation: control pitch, speed, and gender from text. Built on a Qwen2.5 LLM backbone.

cloning · streaming · realtime

jamiepine / Community

Voicebox

Quality 9 · Speed 9.5 · 0.05GB · MIT

Desktop app and orchestrator for local TTS, not a model. Provides a UI studio, voice profile management, and a local API. Generates audio via swappable backends (Qwen3 TTS, Kokoro, Piper, XTTS…). Think of it as a front-end shell that runs on top of your installed TTS models.

streaming · realtime · low-latency

Sesame AI

Sesame CSM

Quality 9.5 · Speed 7.5 · 3.5GB · Apache 2.0

Conversational Speech Model that generates speech with natural turn-taking, backchannels, and interruptions. Built specifically for multi-turn dialogue with real-time response generation.

dialogue · streaming · realtime · emotion
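
Each featured entry above carries a stats line such as "Quality 9.5 · Speed 8.5 · 2.8GB · Apache 2.0". The following is a minimal Python sketch for turning such a line into a record, assuming every line has exactly these four "·"-separated fields in this order. The names `FeaturedStats`, `parse_featured_stats`, and `quality_per_gb` are hypothetical, and the quality-per-GB figure is an illustrative metric of this sketch, not a score the page itself publishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeaturedStats:
    quality: float
    speed: float
    size_gb: float
    license: str


def parse_featured_stats(line: str) -> FeaturedStats:
    """Parse a line like 'Quality 9.5 · Speed 8.5 · 2.8GB · Apache 2.0'."""
    # Assumption: exactly four fields separated by the middle dot.
    quality, speed, size, license_name = [p.strip() for p in line.split("·")]
    return FeaturedStats(
        quality=float(quality.removeprefix("Quality ")),
        speed=float(speed.removeprefix("Speed ")),
        size_gb=float(size.removesuffix("GB")),
        license=license_name,
    )


def quality_per_gb(stats: FeaturedStats) -> float:
    # Hypothetical convenience metric: listed quality divided by download size.
    return stats.quality / stats.size_gb


if __name__ == "__main__":
    kokoro = parse_featured_stats("Quality 9.2 · Speed 9.8 · 0.33GB · Apache 2.0")
    print(kokoro, round(quality_per_gb(kokoro), 1))
```

Because the scores are the page's own ratings rather than measured benchmarks, a ratio like this is only useful for rough comparison between entries on this page.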

All TTS model pages

Bark (Suno)
Suno · Quality 8.5 · Speed 4 · MIT
Canary 1B v2
NVIDIA · Quality 9.3 · Speed 9 · CC-BY-4.0
Chatterbox TTS
Resemble AI · Quality 9.4 · Speed 8 · MIT
ChatTTS
2Noise · Quality 8.8 · Speed 7 · AGPL-3.0
Cohere Transcribe 03-2026
Cohere · Quality 9 · Speed 8 · Apache 2.0
Coqui TTS (XTTS v2)
Coqui · Quality 9.2 · Speed 6 · CPML (custom)
CosyVoice 2
Alibaba FunAudioLLM · Quality 9.3 · Speed 8.8 · Apache 2.0
Dia
Nari Labs · Quality 9.3 · Speed 7 · Apache 2.0
Edge TTS
Microsoft (unofficial) · Quality 8 · Speed 9.5 · GPL-3.0
EmotiVoice
NetEase Youdao · Quality 8.5 · Speed 7.5 · Apache 2.0
eSpeak NG
eSpeak Team · Quality 5.5 · Speed 10 · GPL-3.0
F5-TTS
Speech Research · Quality 9.4 · Speed 9 · MIT
F5-TTS v1.1
Speech Research (SWivid) · Quality 9.5 · Speed 9.2 · MIT
Fish Speech
Fish Audio · Quality 9 · Speed 8.5 · Apache 2.0
GPT-SoVITS
RVC-Boss · Quality 9.1 · Speed 7 · MIT
Higgs Audio v2
Boson AI · Quality 9.7 · Speed 7 · Apache 2.0
IndexTTS 2
Bilibili · Quality 9.4 · Speed 8 · Apache 2.0
IndicTTS
AI4Bharat (IIT Madras) · Quality 8 · Speed 8.5 · MIT
Kitten TTS
Rohan Joshi (@ron_joshi) · Quality 7.5 · Speed 10 · Apache 2.0
Kokoro TTS
hexgrad · Quality 9.2 · Speed 9.8 · Apache 2.0
Kyutai STT 2.6B
Kyutai · Quality 9.4 · Speed 9.5 · CC-BY-4.0
LLaSA 3B
HKUST Audio · Quality 9.2 · Speed 7 · CC-BY-NC 4.0
MARS5
Camb.ai · Quality 9 · Speed 7.5 · AGPL-3.0
MaskGCT
Amphion Team · Quality 9.4 · Speed 9 · MIT
MeloTTS
MyShell · Quality 9 · Speed 9 · MIT
MetaVoice-1B
MetaVoice Inc. · Quality 8.9 · Speed 6 · Apache 2.0
MMS (Meta)
Meta AI · Quality 7 · Speed 7.5 · CC-BY-NC 4.0
Moshi
Kyutai · Quality 9 · Speed 9.5 · CC-BY-4.0
NeuTTS Air
Neuphonic · Quality 9 · Speed 9.5 · Apache 2.0
OCTAVE 2
Hume AI · Quality 9.4 · Speed 7.5 · Hume Terms (research)
OpenVoice V2
MyShell · Quality 8.9 · Speed 9 · MIT
Orpheus TTS
Canopy Labs · Quality 9.6 · Speed 7.5 · Apache 2.0
OuteTTS
OuteAI · Quality 8.7 · Speed 8.5 · MIT
Parakeet TDT 0.6B v2
NVIDIA · Quality 9.4 · Speed 10 · CC-BY-4.0
Parler TTS
Hugging Face · Quality 8.8 · Speed 7 · Apache 2.0
Piper
Rhasspy · Quality 7.5 · Speed 10 · MIT
Qwen3 TTS
Alibaba Cloud (Qwen Team) · Quality 9.5 · Speed 8.5 · Apache 2.0
Sesame CSM
Sesame AI · Quality 9.5 · Speed 7.5 · Apache 2.0
Spark TTS
SparkAudio · Quality 9 · Speed 8.2 · Apache 2.0
Step-Audio 2 Mini
StepFun · Quality 9.3 · Speed 7.5 · Apache 2.0
StyleTTS 2
Yinghao Aaron Li et al. · Quality 9.3 · Speed 6.5 · MIT
TADA
Hume AI · Quality 9.1 · Speed 7.5 · Apache 2.0
Tortoise TTS
James Betker · Quality 9.1 · Speed 3 · Apache 2.0
VibeVoice 1.5B
Microsoft Research · Quality 9.4 · Speed 6.5 · MIT
VibeVoice ASR
Microsoft Research · Quality 9.3 · Speed 7.5 · MIT
VibeVoice Realtime 0.5B
Microsoft Research · Quality 9.1 · Speed 9.2 · MIT
Voicebox
jamiepine / Community · Quality 9 · Speed 9.5 · MIT
Whisper v3 Turbo
OpenAI · Quality 9.1 · Speed 9.5 · MIT
XTTS v3 (Community)
Coqui Community · Quality 9.1 · Speed 7 · MPL 2.0
Zonos v0.1
Zyphra · Quality 9.5 · Speed 8.5 · Apache 2.0
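
The full list above alternates a model name with a "Vendor · Quality Q · Speed S · License" line. As a rough sketch of how that listing could be filtered, the Python below parses those pairs and shortlists permissively licensed models above a speed threshold. `CatalogEntry`, `parse_catalog`, `shortlist`, and the `PERMISSIVE` set are all hypothetical names, and treating only MIT and Apache 2.0 as permissive is an assumption of this example, not a classification made by the page.

```python
from typing import NamedTuple


class CatalogEntry(NamedTuple):
    name: str
    vendor: str
    quality: float
    speed: float
    license: str


def parse_catalog(text: str) -> list[CatalogEntry]:
    """Parse alternating name / 'Vendor · Quality Q · Speed S · License' lines."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    entries = []
    for name, stats in zip(lines[0::2], lines[1::2]):
        vendor, quality, speed, license_name = [p.strip() for p in stats.split("·")]
        entries.append(CatalogEntry(
            name,
            vendor,
            float(quality.split()[1]),  # 'Quality 9.2' -> 9.2
            float(speed.split()[1]),    # 'Speed 9.8' -> 9.8
            license_name,
        ))
    return entries


# Assumption: which licenses count as "permissive" is this sketch's choice.
PERMISSIVE = {"MIT", "Apache 2.0"}


def shortlist(entries: list[CatalogEntry], min_speed: float) -> list[CatalogEntry]:
    """Keep permissively licensed entries at or above min_speed, best quality first."""
    keep = [e for e in entries if e.license in PERMISSIVE and e.speed >= min_speed]
    return sorted(keep, key=lambda e: e.quality, reverse=True)


# Four entries copied from the list above.
SAMPLE = """\
Kokoro TTS
hexgrad · Quality 9.2 · Speed 9.8 · Apache 2.0
MARS5
Camb.ai · Quality 9 · Speed 7.5 · AGPL-3.0
Piper
Rhasspy · Quality 7.5 · Speed 10 · MIT
Tortoise TTS
James Betker · Quality 9.1 · Speed 3 · Apache 2.0
"""

for entry in shortlist(parse_catalog(SAMPLE), min_speed=9.0):
    print(entry.name, entry.quality, entry.license)
```

With the four sample entries, the shortlist keeps Kokoro TTS and Piper: MARS5 is dropped for its AGPL-3.0 license and Tortoise TTS for its speed rating of 3.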