nytopop / csm
A Conversational Speech Generation Model
☆12 Updated 2 months ago
Alternatives and similar repositories for csm
Users who are interested in csm are comparing it to the repositories listed below
- Sesame Converse - Real Time Conversations - Powered by Gemma 3 ☆61 Updated last month
- ☆21 Updated last month
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers ☆54 Updated last month
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated last month
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆27 Updated 7 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆35 Updated 3 weeks ago
- Open TTS models, built for streaming on the edge ☆41 Updated 2 months ago
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆52 Updated 5 months ago
- Lightweight Gradio based WebUI for orpheusTTS - WSL / Linux [CUDA] ☆93 Updated last month
- Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and… ☆73 Updated last month
- Collection of Open Source Speech Data ☆157 Updated 6 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆63 Updated last week
- Finetune Sesame's CSM 1B model, for fun and profit ☆16 Updated last month
- ☆156 Updated last week
- Examples of using the llasa-tts models locally ☆172 Updated 3 weeks ago
- ☆214 Updated last month
- Extract voice segments of a target speaker from podcasts - Useful for creating speech datasets ☆41 Updated this week
- A functioning Sesame CSM project with a desktop GUI - Real-time factor: 0.6x with 4070 Ti Super - Requires only 8GB VRAM ☆31 Updated 2 weeks ago
- Win & Linux Gradio WebUI for CSM-1B model by sesame ☆43 Updated 2 months ago
- Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. F… ☆143 Updated last month
- A lightweight Python library for running TTS models with a unified API. ☆18 Updated 2 months ago
- realtime conversational dynamics ☆19 Updated last month
- StyleTTS 2 Optimized Training Fork ☆28 Updated 3 months ago
- Realtime demo, Streaming and Finetuning code for CSM ☆294 Updated this week
- B-Llama3o a llama3 with Vision Audio and Audio understanding as well as text and Audio and Animation Data output. ☆26 Updated 11 months ago
- Finetune Sesame AI's conversational speech model on new languages and voices. Blog post: https://blog.speechmatics.com/sesame-finetune ☆37 Updated this week
- Create an LJSpeech structured voice dataset on wave input ☆29 Updated 7 months ago
- ☆96 Updated last year
- Audio tokenization, in the fastest way possible! ☆52 Updated 8 months ago
- Your personal and private AI ☆45 Updated last month