nytopop / csm
A Conversational Speech Generation Model
☆11 · Updated 2 weeks ago
Alternatives and similar repositories for csm:
Users who are interested in csm are comparing it to the repositories listed below.
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆61 · Updated 3 weeks ago
- ☆21 · Updated 2 weeks ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆32 · Updated this week
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆27 · Updated 5 months ago
- Open TTS models, built for streaming on the edge ☆39 · Updated 2 weeks ago
- G2P ☆182 · Updated this week
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆53 · Updated 3 months ago
- Joint speech-language model - respond directly to audio! ☆30 · Updated 10 months ago
- Simulates talk with an AI that can express emotions ☆61 · Updated 8 months ago
- Sesame Converse - Real Time Conversations - Powered by Gemma 3 ☆58 · Updated 2 weeks ago
- Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and… ☆60 · Updated this week
- Audio tokenization, in the fastest way possible! ☆49 · Updated 7 months ago
- entropix style sampling + GUI ☆25 · Updated 5 months ago
- Video+code lecture on building nanoGPT from scratch ☆66 · Updated 9 months ago
- ☆62 · Updated 8 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 · Updated 5 months ago
- Collection of Open Source Speech Data ☆152 · Updated 4 months ago
- Orpheus Chat WebUI ☆32 · Updated this week
- Create an LJSpeech structured voice dataset on wave input ☆27 · Updated 6 months ago
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very … ☆34 · Updated 3 months ago
- Lightweight Gradio based WebUI for orpheusTTS - WSL / Linux [CUDA] ☆62 · Updated last week
- Deploy Apollo HF space locally ☆40 · Updated 3 months ago
- realtime conversational dynamics ☆18 · Updated 2 weeks ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆157 · Updated last week
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 · Updated 10 months ago
- A multi-player tournament benchmark that tests LLMs in social reasoning, strategy, and deception. Players engage in public and private co… ☆78 · Updated this week
- python bindings for symphonia/opus - read various audio formats from python and write opus files ☆54 · Updated last week
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆57 · Updated 11 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in MLX ☆20 · Updated 5 months ago
- run ollama & gguf easily with a single command ☆50 · Updated 10 months ago