kyutai-labs/unmute

Make text LLMs listen and speak

☆1,365

Alternatives and similar repositories for unmute

Users who are interested in unmute are comparing it to the repositories listed below.

kyutai-labs / delayed-streams-modeling
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
☆2,980 · Jan 26, 2026 · Updated 5 months ago
kyutai-labs / moshi-finetune
☆473 · Oct 3, 2025 · Updated 9 months ago
kyutai-labs / pocket-tts
A TTS that fits in your CPU (and pocket)
☆7,793 · Updated this week
kyutai-labs / moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…
☆10,636 · May 16, 2026 · Updated 2 months ago
Marvis-Labs / marvis-tts
☆365 · Aug 28, 2025 · Updated 10 months ago
canopyai / Orpheus-TTS
Towards Human-Sounding Speech
☆6,247 · Dec 5, 2025 · Updated 7 months ago
fluxions-ai / vui
Real-time voice assistant: WebRTC streaming, faster-whisper ASR, local LLM, Vui Nano (300M) TTS. OpenAI Realtime API compatible. Voice c…
☆727 · Jul 9, 2026 · Updated last week
kyutai-labs / hibiki
Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits f…
☆1,486 · Apr 15, 2025 · Updated last year
pipecat-ai / pipecat
Open Source framework for voice and multimodal conversational AI
☆13,593 · Updated this week
pipecat-ai / smart-turn
☆1,477 · Jan 29, 2026 · Updated 5 months ago
kyutai-labs / moshi-rag
MoshiRAG is a compact full-duplex speech language model augmented with asynchronous knowledge retrieval to improve factuality without sac…
☆130 · Apr 28, 2026 · Updated 2 months ago
resemble-ai / chatterbox
SoTA open-source TTS
☆25,595 · Jun 10, 2026 · Updated last month
huggingface / speech-to-speech
Build local voice agents with open-source models
☆6,196 · Updated this week
herimor / voxtream
VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control
☆244 · May 30, 2026 · Updated last month
gradium-ai / gradium-py
Python client for the Gradium Voice AI api.
☆32 · Jul 10, 2026 · Updated last week
kyutai-labs / hibiki-zero
A real-time and multilingual speech translation model
☆262 · Feb 13, 2026 · Updated 5 months ago
primepake / dac_vae
Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder
☆38 · Aug 30, 2025 · Updated 10 months ago
ReisCook / VoiceAssistant
A functioning Sesame CSM project with a desktop GUI - Real-time factor: 0.6x with 4070 Ti Super - Requires only 8GB VRAM
☆81 · May 19, 2025 · Updated last year
yangdongchao / RSTnet
Real-time Speech-Text Foundation Model Toolkit (wip)
☆256 · Mar 26, 2025 · Updated last year
Ereboas / MagiCodec
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆124 · Jun 4, 2025 · Updated last year
HeCheng0625 / Diffusion-Speech-Tokenizer
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198 · Jan 25, 2026 · Updated 5 months ago
ysharma3501 / MiraTTS
A high quality and fast TTS repository
☆517 · Dec 22, 2025 · Updated 6 months ago
Liquid4All / liquid-audio
Liquid Audio - Speech-to-Speech audio models by Liquid AI
☆548 · Jun 5, 2026 · Updated last month
randombk / chatterbox-vllm
VLLM Port of the Chatterbox TTS model
☆379 · Oct 18, 2025 · Updated 9 months ago
SesameAILabs / csm
A Conversational Speech Generation Model
☆14,703 · May 27, 2025 · Updated last year
ekwek1 / soprano
Soprano: Instant, Ultra-Realistic Text-to-Speech
☆1,250 · Jan 15, 2026 · Updated 6 months ago
nytopop / csm
A Conversational Speech Generation Model
☆14 · Mar 16, 2025 · Updated last year
meituan-longcat / LongCat-Audio-Codec
LongCat Audio Tokenizer and Detokenizer
☆301 · May 9, 2026 · Updated 2 months ago
k2-fsa / ZipVoice
Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching
☆1,015 · Dec 2, 2025 · Updated 7 months ago
KoljaB / RealtimeVoiceChat
Have a natural, spoken conversation with AI!
☆3,799 · Jul 11, 2025 · Updated last year
facebookresearch / dacvae
DACVAE
☆226 · Dec 22, 2025 · Updated 6 months ago
NVIDIA / personaplex
PersonaPlex code.
☆10,221 · Mar 2, 2026 · Updated 4 months ago
fixie-ai / ultravox
A fast multimodal LLM for real-time voice
☆4,476 · Dec 12, 2025 · Updated 7 months ago
livekit / agents
A framework for building realtime voice AI agents 🤖🎙️📹
☆11,438 · Updated this week
edwko / OuteTTS
Interface for OuteTTS models.
☆1,435 · Mar 23, 2026 · Updated 3 months ago
k2-fsa / Flow2GAN
Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation
☆144 · Mar 8, 2026 · Updated 4 months ago
nari-labs / dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
☆19,341 · Nov 19, 2025 · Updated 8 months ago
kyutai-labs / tts_longeval
☆30 · Apr 29, 2026 · Updated 2 months ago
XiaomiMiMo / MiMo-Audio
MiMo-Audio: Audio Language Models are Few-Shot Learners
☆1,064 · Jun 17, 2026 · Updated last month