☆31Jul 18, 2024Updated last year
Alternatives and similar repositories for e2tts-test-suite
Users that are interested in e2tts-test-suite are comparing it to the libraries listed below.
- Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"☆14Feb 13, 2022Updated 4 years ago
- ☆18Sep 19, 2023Updated 2 years ago
- ☆18Aug 23, 2024Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 10 months ago
- ☆36Sep 6, 2025Updated 6 months ago
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆18Sep 13, 2024Updated last year
- Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database.☆12Mar 15, 2025Updated last year
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset☆15Apr 7, 2025Updated 11 months ago
- Training and inference code for a reproduction of Google's speech restoration model Miipher-2. Pretrained models are also released.☆31Feb 7, 2026Updated last month
- ☆37Mar 30, 2021Updated 4 years ago
- ☆13Sep 13, 2023Updated 2 years ago
- [INTERSPEECH 2025 Oral]Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"☆64Jun 16, 2025Updated 9 months ago
- Audio Research in US. US-based professors who work on audio (music, speech, acoustics). For students who would like to apply for RA, PhD,…☆27Feb 27, 2026Updated 3 weeks ago
- Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"☆41Jun 28, 2025Updated 8 months ago
- ☆46Jul 7, 2025Updated 8 months ago
- Whisper Speech Quality Assessment (WhiSQA)☆16Oct 14, 2025Updated 5 months ago
- Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).☆25Sep 19, 2025Updated 6 months ago
- Bilingual Singing Voice Synthesis☆18Mar 25, 2024Updated last year
- The demo page for ALMTokenizer☆59Apr 14, 2025Updated 11 months ago
- Neural text to speech system that uses eSpeak as a text/phoneme front-end☆16Oct 20, 2021Updated 4 years ago
- FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS☆24Sep 9, 2024Updated last year
- This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).☆14Jun 15, 2021Updated 4 years ago
- DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast…☆53Updated this week
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM☆17Nov 7, 2024Updated last year
- [Interspeech 2025] DualCodec: A Low-Frame-Rate, Semantically-Enhanced Neural Audio Codec☆63Mar 11, 2026Updated last week
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”☆56Jun 1, 2025Updated 9 months ago
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆45Sep 5, 2025Updated 6 months ago
- ☆12Nov 7, 2024Updated last year
- Official implementation of DNSMOS Pro (accepted at INTERSPEECH 2024).☆83Jun 8, 2025Updated 9 months ago
- A unified model for zero-shot singing voice conversion and synthesis☆22Nov 30, 2022Updated 3 years ago
- Source code for training models and using the hyperbolic interface proposed in our ICASSP 2023 paper, “Hyperbolic Audio Source Separation…☆69Apr 27, 2023Updated 2 years ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators☆54May 15, 2025Updated 10 months ago
- Repository for multilingual speech data resources for native languages of Zambia.☆20Oct 9, 2024Updated last year
- ☆11Mar 22, 2023Updated 3 years ago
- ☆12Jun 10, 2021Updated 4 years ago
- An open-source Kazakh Emotional Text-to-Speech Dataset☆35Aug 1, 2025Updated 7 months ago
- ☆80Aug 11, 2025Updated 7 months ago
- EMO-SUPERB submission☆51Oct 13, 2025Updated 5 months ago
- A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos.☆66Oct 28, 2024Updated last year