thomasgauthier / csm-hf
Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers
☆57 · Updated 5 months ago
Alternatives and similar repositories for csm-hf
Users who are interested in csm-hf are comparing it to the libraries listed below.
- Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆127 · Updated 2 months ago
- Streaming and Fine-tuning for Chatterbox TTS ☆204 · Updated 4 months ago
- Unofficial WIP LoRa Finetuning repository for VibeVoice ☆238 · Updated last month
- ☆284 · Updated 3 months ago
- ☆319 · Updated 3 weeks ago
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆159 · Updated this week
- ☆234 · Updated 5 months ago
- ☆280 · Updated 2 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆68 · Updated last week
- Open TTS models, built for streaming on the edge ☆43 · Updated 7 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆43 · Updated last month
- BeltOut: An open source pitch-perfect voice-to-voice timbre transfer model based on ChatterboxVC ☆78 · Updated 3 months ago
- A random walk voice style cloning application for Kokoro text to speech ☆152 · Updated 4 months ago
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification. ☆24 · Updated 7 months ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆287 · Updated 5 months ago
- ☆50 · Updated last week
- SoTA open-source TTS ☆103 · Updated 4 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆213 · Updated 6 months ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆292 · Updated 4 months ago
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆247 · Updated 7 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆125 · Updated 3 months ago
- SoTA open-source TTS ☆112 · Updated last week
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆80 · Updated last year
- ☆186 · Updated 2 weeks ago
- Examples of using the llasa-tts models locally ☆181 · Updated 6 months ago
- Very fast, accurate speaker diarization ☆158 · Updated this week
- Official implementation of the TTS model Lina-Speech ☆170 · Updated 9 months ago
- This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD… ☆185 · Updated last month
- Sesame Converse - Real Time Conversations - Powered by Gemma 3 ☆64 · Updated 7 months ago
- Finetune Sesame's CSM 1B model, for fun and profit ☆17 · Updated 7 months ago