gooofy / zerovox
zero-shot realtime TTS system, fully offline, free and open source
☆41Updated 2 months ago
Alternatives and similar repositories for zerovox
Users that are interested in zerovox are comparing it to the libraries listed below
- An unofficial pytorch implementation of "STREAMVC: REAL-TIME LOW-LATENCY VOICE CONVERSION".☆69Updated 2 months ago
- (WIP) A retrain of F5-TTS on permissively-licensed data☆11Updated 2 months ago
- Simple and lightweight Zero-shot Text-to-Speech (TTS) synthesis model☆25Updated last month
- A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model.☆21Updated last month
- AudioSR-Upsampling (any -> 48kHz)☆41Updated last year
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨☆84Updated last month
- StyleTTS 2 Optimized Training Fork☆31Updated 4 months ago
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization an ONNX runtime port…☆21Updated 9 months ago
- StyleTTS2 + Vocos as a Decoder☆12Updated 2 months ago
- High quality text-to-speech based on StyleTTS 2.☆51Updated last week
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆98Updated 8 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆62Updated 3 weeks ago
- ☆40Updated 4 months ago
- Official code for "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization"☆82Updated 2 weeks ago
- Real-time end-to-end singing voice conversion☆22Updated 7 months ago
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion☆34Updated last year
- ☆29Updated last year
- [TAFFC 2025] The official implementation of EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vec…☆94Updated 2 months ago
- TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching☆60Updated 2 months ago
- Pytorch implementation of SoundCTM☆96Updated 2 months ago
- An open-source Kazakh Emotional Text-to-Speech Dataset☆29Updated last year
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.☆18Updated 3 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model.☆38Updated this week
- A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS☆42Updated 6 months ago
- ☆24Updated last month
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very …☆39Updated 6 months ago
- Unofficial implementation of wavenext vocoder☆47Updated 9 months ago
- Export an ONNX graph that performs ISTFT. Designed for TTS models.☆24Updated last year
- Llasa Speed Up☆35Updated 3 weeks ago
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis☆35Updated 7 months ago