rooms-solutions / csm-multilingual

Multilingual extension of the SesameAILabs Conversational Speech Generation Model

☆29

Alternatives and similar repositories for csm-multilingual

Users who are interested in csm-multilingual are comparing it to the libraries listed below.

asiff00 / On-Device-Speech-to-Speech-Conversational-AI
This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming archi…
☆210 Updated last week
davidbrowne17 / chatterbox-streaming
Streaming and Fine-tuning for Chatterbox TTS
☆226 Updated 5 months ago
mbzuai-oryx / LLMVoX
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM
☆291 Updated 6 months ago
fluxions-ai / vui
☆635 Updated 3 weeks ago
playht / PlayDiffusion
☆530 Updated 2 months ago
jasonppy / VoiceStar
VoiceStar: Robust, Duration-controllable TTS that can Extrapolate
☆297 Updated 6 months ago
mahimairaja / awesome-csm-1b
List of curated use cases built using Sesame's CSM 1B
☆73 Updated 6 months ago
revdotcom / reverb
Open source inference code for Rev's model
☆433 Updated 7 months ago
Marvis-Labs / marvis-tts
☆312 Updated 3 months ago
KoljaB / WhoSpeaks
Efficient approach to speaker diarization using voice characteristics extraction
☆105 Updated 5 months ago
ai-bot-pro / achatbot
An open source chat bot architecture for voice/vision (and multimodal) assistants that can run locally (CPU/GPU bound) or remotely (I/O bound).
☆88 Updated this week
anan235 / dia-multilingual
A TTS model capable of generating ultra-realistic dialogue in one pass.
☆216 Updated 7 months ago
coqui-ai / xtts-streaming-server
☆354 Updated last year
lalanikarim / webrtc-ai-voice-chat
A WebRTC server that allows you to interact with an LLM using your speech and responds back with generated audio.
☆141 Updated last year
KoljaB / LocalEmotionalAIVoiceChat
Simulates talk with an AI that can express emotions
☆82 Updated 5 months ago
kyutai-labs / moshi-finetune
☆338 Updated 2 months ago
zhenye234 / LLaSA_training
LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
☆634 Updated 7 months ago
ictnlp / LLaMA-Omni2
☆251 Updated 6 months ago
tincans-ai / gazelle
Joint speech-language model - respond directly to audio!
☆372 Updated last year
hexgrad / misaki
G2P
☆365 Updated 3 months ago
davidbrowne17 / csm-streaming
Realtime demo, Streaming and Finetuning code for CSM
☆419 Updated 2 months ago
maitrix-org / Voila
☆476 Updated 7 months ago
MYZY-AI / Muyan-TTS
☆470 Updated 6 months ago
phildougherty / sesame_csm_openai
OpenAI compatible TTS for Sesame CSM:1b & dia:1.6b - Voice Cloning from File/YT
☆429 Updated 2 months ago
hanifabd / voice-activity-detection-vad-realtime
Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (realtime) transcription
☆103 Updated 3 months ago
Liquid4All / liquid-audio
Liquid Audio - Speech-to-Speech audio models by Liquid AI
☆285 Updated 2 months ago
thomasgauthier / csm-hf
Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers
☆57 Updated 6 months ago
kyutai-labs / unmute
Make text LLMs listen and speak
☆1,008 Updated 2 weeks ago
luweigen / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆121 Updated last year
Lex-au / Vocalis
Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. F…
☆266 Updated 7 months ago