thomasgauthier / csm-hf
Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers
☆56 Updated 2 weeks ago
Alternatives and similar repositories for csm-hf
Users that are interested in csm-hf are comparing it to the libraries listed below
- Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆38 Updated 2 weeks ago
- Open TTS models, built for streaming on the edge ☆43 Updated 2 months ago
- Finetune Sesame's CSM 1B model, for fun and profit ☆16 Updated 2 months ago
- ☆226 Updated 2 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated this week
- ☆173 Updated 2 weeks ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆36 Updated last week
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆88 Updated 2 weeks ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆254 Updated 2 weeks ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆164 Updated last month
- ☆95 Updated last year
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆228 Updated 2 months ago
- Sesame Converse - Real Time Conversations - Powered by Gemma 3 ☆62 Updated 2 months ago
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆271 Updated 2 months ago
- Recipes to create the synthetic data for the benchmarked TTS systems. ☆25 Updated 5 months ago
- Examples of using the llasa-tts models locally ☆171 Updated last month
- VALL-E 2 reproduction ☆129 Updated 10 months ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆73 Updated 8 months ago
- An unofficial PyTorch implementation of VALL-E ☆87 Updated this week
- Multilingual extension of the SesameAILabs Conversational Speech Generation Model ☆26 Updated 2 months ago
- ☆62 Updated 10 months ago
- Realtime demo, Streaming and Finetuning code for CSM ☆307 Updated 2 weeks ago
- StyleTTS 2 Optimized Training Fork ☆29 Updated 4 months ago
- ☆224 Updated this week
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆179 Updated 8 months ago
- create dataset from list of youtube links easily ☆18 Updated 2 years ago
- Hanasu is a human-like TTS model based on the multilingual Himitsu V1 transformer-based encoder and VITS architecture ☆28 Updated this week
- zero-shot realtime TTS system, fully offline, free and open source ☆39 Updated last month
- Official implementation of the TTS model Lina-Speech ☆164 Updated 4 months ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G… ☆23 Updated 2 months ago