Local TTS model

Sesame CSM

Q: Can Sesame CSM run locally?

Sesame CSM is listed by LocalClaw as a local TTS option. Hardware fit depends on runtime, model size and backend support.

Conversational Speech Model - generates speech with natural turn-taking, backchannels and interruptions. Built specifically for multi-turn dialogue with real-time response generation.

GPU recommended text-to-speech generation 1 languages Apache 2.0

Compare TTS models Open source page

Quality

9.5/10

Speed

7.5/10

Model size

3.5 GB

Voices

Built-in conversational voices

Can Sesame CSM run locally?

Sesame CSM can generate speech locally for private voice workflows. Start with pip install sesame-csm.

Apache 2.0 license. Still verify upstream usage notes before shipping.

pip install sesame-csm Upstream source

dialoguestreamingrealtimeemotion

Audio profile

Quality

9.5

Speed

7.5

Local

8.7

Best fit

Sesame CSM is best for fast on-device voice responses and local assistants.

Hardware: gpuapple

Model details

Type

Local TTS model

Family

sesame

Latency

low

Formats

pytorchsafetensors

Languages

Context

Turn-taking, backchannels, interruptions

Install locally

Check runtimeConfirm the backend supports pytorch, safetensors on your machine.

Install modelUse the upstream command or repository instructions.

Test locallyRun a short private audio prompt before moving into production workflows.

pip install sesame-csm

Good for

text-to-speech generation
GPU recommended local workflows
dialogue, streaming, realtime

Watch before shipping

Validate pronunciation, latency and artifacts with your own voice samples.
Review the upstream license and acceptable-use notes.
Benchmark on your target CPU, Apple Silicon or GPU setup.

Related TTS and speech models

CompareBrowse all TTS models Local AIBrowse LLM models macOS appGet LocalClaw