F5-TTS v1.1
Iterative upgrade over the original F5-TTS: faster convergence via an improved flow-matching schedule, better Chinese prosody, and cross-lingual cloning. Now with streaming inference and an improved CFM (conditional flow matching) sampler.
Quality-focused TTS with impressive voice cloning. Slower but produces very natural speech.
Tortoise TTS is a local speech model from James Betker. It is best suited for cloning workflows. Check the license before commercial use.
gpu
pytorch
Unlimited cloning
high
Apache 2.0
2022-05
pip install tortoise-tts
cloning
First super-realistic TTS LLM that runs in real-time on CPU. 748M params, LLaMA 3.2 backbone + NeuCodec audio tokenizer. GGUF-native - perfect for on-device agents and offline apps. Instant 3s voice cloning.
Flow-matching based TTS with SOTA quality and extremely fast inference. Simple and efficient architecture.
Fully non-autoregressive TTS - no text-to-phoneme alignment needed. Achieves human parity on naturalness and similarity metrics. Incredibly fast inference.
Industrial-grade multilingual TTS with streaming, voice cloning and emotion control. Exceptional Chinese + English quality. Used in production at Alibaba scale.
High-quality multilingual TTS with extremely natural voice cloning. Best for Chinese and English with fast inference.