rooms-solutions / csm-multilingual
Multilingual extension of the SesameAILabs Conversational Speech Generation Model
☆27 Updated 5 months ago
Alternatives and similar repositories for csm-multilingual
Users who are interested in csm-multilingual are comparing it to the libraries listed below.
- ☆246 Updated 2 weeks ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆273 Updated 4 months ago
- This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming archi… ☆189 Updated 4 months ago
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆613 Updated 5 months ago
- Streaming and Fine-tuning for Chatterbox TTS ☆178 Updated 3 months ago
- ☆228 Updated 3 months ago
- Simulates talk with an AI that can express emotions ☆78 Updated 2 months ago
- ☆278 Updated last month
- An open source chat bot architecture for voice/vision (and multimodal) assistants, local (CPU/GPU bound) and remote (I/O bound) to run. ☆74 Updated this week
- ☆294 Updated 2 months ago
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers ☆57 Updated 3 months ago
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆159 Updated last year
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆282 Updated 3 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆205 Updated 4 months ago
- Open source inference code for Rev's model ☆428 Updated 4 months ago
- ☆516 Updated 3 weeks ago
- Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. F… ☆236 Updated 5 months ago
- OpenAI compatible TTS for Sesame CSM:1b & dia:1.6b - Voice Cloning from File/YT ☆404 Updated last month
- Realtime demo, Streaming and Finetuning code for CSM ☆394 Updated this week
- List of curated use cases built using Sesame's CSM 1B ☆73 Updated 3 months ago
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆245 Updated 5 months ago
- Joint speech-language model - respond directly to audio! ☆372 Updated last year
- ☆457 Updated 3 months ago
- ☆343 Updated last year
- A WebRTC server that allows you to interact with an LLM using your speech and responds back with generated audio. ☆136 Updated last year
- ☆632 Updated last month
- G2P ☆316 Updated last month
- Have a natural voice conversation with an LLM ☆255 Updated 9 months ago
- Cascading voice assistant combining real-time speech recognition, AI reasoning, and neural text-to-speech capabilities. ☆118 Updated last week
- Fine Tune the Style-TTS2 Voice Model ☆251 Updated 2 months ago