Local TTS model

VibeVoice 1.5B

Q: Can VibeVoice 1.5B run locally?

VibeVoice 1.5B is listed by LocalClaw as a local TTS option. Hardware fit depends on runtime, model size and backend support.

Open-source long-form multi-speaker TTS model (up to 90 min, up to 4 speakers). Listed as research-first with responsible-use constraints.
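The four-speaker ceiling is easy to check before starting a long generation run. The sketch below assumes a plain-text script in which each turn begins with a `Speaker N:` label. That label format and the helper names `count_speakers` and `check_script` are illustrative assumptions, not something this page specifies.

```python
import re

MAX_SPEAKERS = 4  # limit stated on this page

# Assumed turn format: "Speaker 1: Hello there" (hypothetical; adjust to your script)
TURN_PATTERN = re.compile(r"^\s*(Speaker\s+\d+)\s*:", re.IGNORECASE)


def count_speakers(script: str) -> set[str]:
    """Return the set of distinct speaker labels found in the script."""
    speakers = set()
    for line in script.splitlines():
        match = TURN_PATTERN.match(line)
        if match:
            # Normalise spacing and case so "speaker  1" and "Speaker 1" count once
            speakers.add(" ".join(match.group(1).split()).title())
    return speakers


def check_script(script: str) -> None:
    """Raise ValueError if the script uses more speakers than the stated limit."""
    speakers = count_speakers(script)
    if len(speakers) > MAX_SPEAKERS:
        raise ValueError(f"{len(speakers)} speakers found, limit is {MAX_SPEAKERS}")
```

Running `check_script` on a draft before synthesis catches an over-limit script early, before any GPU time is spent.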

GPU recommended · text-to-speech generation · 2 languages · MIT


Quality: 9.4/10
Speed: 6.5/10
Model size: 5.8 GB
Voices: Up to 4 speakers in long-form dialogue
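One rough way to read the 5.8 GB model size against your hardware is to compare it with free memory plus some headroom for activations and audio buffers. The 1.5x headroom factor below is an illustrative assumption, not a measured requirement, so treat the result as a first filter and benchmark on the real machine.

```python
MODEL_SIZE_GB = 5.8  # model size listed on this page


def fits_in_memory(available_gb: float, model_gb: float = MODEL_SIZE_GB,
                   headroom: float = 1.5) -> bool:
    """Return True if available memory covers the weights times a headroom factor.

    headroom is an assumed safety margin for activations and buffers.
    """
    if available_gb <= 0 or model_gb <= 0 or headroom < 1.0:
        raise ValueError("sizes must be positive and headroom must be >= 1.0")
    return available_gb >= model_gb * headroom


if __name__ == "__main__":
    for gb in (8, 12, 24):
        print(f"{gb} GB available -> fits: {fits_in_memory(gb)}")
```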

Can VibeVoice 1.5B run locally?

VibeVoice 1.5B can generate speech locally for private voice workflows. Start with pip install vibevoice && python demo/tts_inference.py.

MIT license. Still verify upstream usage notes before shipping.


streaming · dialogue · multilingual · emotion

Audio profile

Quality: 9.4
Speed: 6.5
Local: 8.0

Best fit

VibeVoice 1.5B is best for multilingual local speech generation.

Hardware: gpu, apple

Model details

Type: Local TTS model
Family: vibevoice
Latency: medium
Formats: pytorch, safetensors
Languages: en, zh
Context: Research-first release, watermark + disclosure recommended
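One lightweight way to act on the disclosure recommendation is to write a small sidecar file next to each generated clip. This is only a sketch: the file naming, the field names, and the helper `write_disclosure` are all hypothetical, and it covers disclosure only. It does not add an audio watermark.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_disclosure(audio_path: str, model_name: str = "VibeVoice 1.5B") -> Path:
    """Write <audio file name>.disclosure.json beside the audio file and return its path."""
    audio = Path(audio_path)
    sidecar = audio.with_name(audio.name + ".disclosure.json")
    record = {
        # Field names are illustrative assumptions, not a standard schema
        "ai_generated": True,
        "model": model_name,
        "audio_file": audio.name,
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```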

Install locally

Check runtime: Confirm the backend supports pytorch and safetensors on your machine.

Install model: Use the upstream command or repository instructions.

Test locally: Run a short private audio prompt before moving into production workflows.

pip install vibevoice && python demo/tts_inference.py
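The runtime check in the first step can be sketched as a short script that reports which packages are missing and which device PyTorch would use. It assumes the listed formats map to the `torch` and `safetensors` Python packages; the helper names `missing_packages` and `pick_device` are hypothetical, and the script says nothing about whether VibeVoice itself runs well on the reported device.

```python
import importlib.util

REQUIRED = ("torch", "safetensors")  # assumed packages behind the listed formats


def missing_packages(names=REQUIRED) -> list[str]:
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pick_device() -> str:
    """Return 'cuda', 'mps' or 'cpu'; fall back to 'cpu' when torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # attribute is absent on older torch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("missing:", missing_packages() or "none")
    print("device:", pick_device())
```

A result of `cpu` on a machine with a GPU usually points at the installed PyTorch build rather than at the model.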

Good for

text-to-speech generation
GPU recommended local workflows
streaming, dialogue, multilingual

Watch before shipping

Validate pronunciation, latency and artifacts with your own voice samples.
Review the upstream license and acceptable-use notes.
Benchmark on your target CPU, Apple Silicon or GPU setup.
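For the latency and benchmarking checks above, a common figure is the real-time factor: wall-clock generation time divided by the duration of the audio produced (below 1.0 means faster than real time). The sketch assumes the output is a PCM WAV file readable by Python's `wave` module. `synthesize` is a hypothetical callable you supply that runs one generation and returns the output path; it is not a VibeVoice API.

```python
import time
import wave
from typing import Callable


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(synthesize: Callable[[], str]) -> float:
    """Time one synthesis call and divide by the length of the audio it wrote.

    synthesize is a placeholder: it must generate audio and return the WAV path.
    """
    start = time.perf_counter()
    out_path = synthesize()
    elapsed = time.perf_counter() - start
    duration = wav_duration_seconds(out_path)
    if duration <= 0:
        raise ValueError("generated audio is empty")
    return elapsed / duration
```

Run it several times on each target machine and compare medians, since the first call often includes model loading.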

Related TTS and speech models

Microsoft Research VibeVoice Realtime 0.5B · Local TTS model · Q 9.1 · Speed 9.2
Microsoft Research VibeVoice ASR · Local ASR model · Q 9.3 · Speed 7.5
Boson AI Higgs Audio v2 · Local TTS model · Q 9.7 · Speed 7
StepFun Step-Audio 2 Mini · Local TTS model · Q 9.3 · Speed 7.5
Alibaba Cloud (Qwen Team) Qwen3 TTS · Local TTS model · Q 9.5 · Speed 8.5
Kyutai Moshi · Local TTS model · Q 9 · Speed 9.5
Alibaba FunAudioLLM CosyVoice 2 · Local TTS model · Q 9.3 · Speed 8.8
OpenBMB VoxCPM2 · Local TTS model · Q 9.4 · Speed 8.3
