jakariaemon / WSI
Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.
☆18 Updated 3 months ago
Alternatives and similar repositories for WSI
Users who are interested in WSI are comparing it to the repositories listed below.
- A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model. ☆24 Updated last month
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated 3 weeks ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors ☆27 Updated 4 months ago
- StyleTTS 2 Optimized Training Fork ☆31 Updated 4 months ago
- StyleTTS2 + Vocos as a Decoder ☆12 Updated 3 months ago
- Open TTS models, built for streaming on the edge ☆43 Updated 3 months ago
- The Vokan Architecture (Tsukasa speech based) ☆10 Updated 4 months ago
- High quality text-to-speech based on StyleTTS 2. ☆51 Updated last week
- ☆50 Updated 2 months ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆90 Updated last month
- ☆35 Updated last year
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 Updated last month
- ☆26 Updated 7 months ago
- ☆33 Updated 2 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated last year
- Official code for "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization" ☆85 Updated 3 weeks ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated last year
- ☆13 Updated 10 months ago
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆38 Updated this week
- A collection of all our phonemizers for dataset construction and inference ☆24 Updated 4 months ago
- TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching ☆60 Updated 2 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆17 Updated 7 months ago
- ☆24 Updated last month
- Official Code for ParrotTTS ☆51 Updated 8 months ago
- A Neural Vocoder supporting Ring Attention, Conformer and NSF. ☆19 Updated 4 months ago
- Codebase and project page for EDMSound ☆34 Updated last year
- ☆15 Updated 2 months ago
- ☆62 Updated 11 months ago
- ☆60 Updated last year
- An open-source Kazakh Emotional Text-to-Speech Dataset ☆30 Updated last year