DysfluentWFST
☆18 · Nov 13, 2025 · Updated 3 months ago
Alternatives and similar repositories for DysfluentWFST
Users who are interested in DysfluentWFST are comparing it to the repositories listed below
- YOLO-Stutter: End-to-end Region-Wise Speech Dysfluency Detection ☆20 · Mar 4, 2025 · Updated last year
- A Weakly Supervised Forced Alignment for dysfluent speech ☆15 · Nov 12, 2023 · Updated 2 years ago
- ☆20 · Sep 20, 2024 · Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 · Aug 18, 2023 · Updated 2 years ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning ☆18 · Oct 20, 2024 · Updated last year
- LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM ☆18 · May 17, 2024 · Updated last year
- Speech enhancement in noisy and reverberant environments using deep neural networks ☆22 · Oct 10, 2025 · Updated 4 months ago
- Bilingual Singing Voice Synthesis ☆18 · Mar 25, 2024 · Updated last year
- [CVPR 2025] Official implementation of paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie… ☆23 · Jun 6, 2025 · Updated 8 months ago
- ☆19 · Sep 10, 2024 · Updated last year
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" … ☆10 · Nov 6, 2024 · Updated last year
- ☆11 · Aug 11, 2023 · Updated 2 years ago
- Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database. ☆12 · Mar 15, 2025 · Updated 11 months ago
- text to speech ☆10 · Mar 19, 2024 · Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 · Jun 27, 2025 · Updated 8 months ago
- ☆11 · Nov 7, 2024 · Updated last year
- ☆10 · Mar 20, 2021 · Updated 4 years ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆12 · Mar 11, 2025 · Updated 11 months ago
- A Model (maybe an app) that translates the audio of a video from one language to another language, cloning the voice of original video wi… ☆15 · May 19, 2025 · Updated 9 months ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 · Mar 24, 2023 · Updated 2 years ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. … ☆13 · Dec 4, 2024 · Updated last year
- Onset-and-Offset-Aware Sound Event Detection ☆21 · Feb 10, 2025 · Updated last year
- ☆57 · May 29, 2025 · Updated 9 months ago
- Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation" ☆13 · Oct 31, 2024 · Updated last year
- ☆13 · Jan 5, 2025 · Updated last year
- ☆14 · Aug 1, 2025 · Updated 7 months ago
- ☆15 · Nov 11, 2024 · Updated last year
- Cantonese Grapheme-to-Phoneme Converter based on GitYCC/g2pW ☆15 · Dec 10, 2024 · Updated last year
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition ☆11 · Mar 14, 2025 · Updated 11 months ago
- Java Bindings for the C++ library DeepSpeech ☆10 · Jun 4, 2020 · Updated 5 years ago
- [ICMR 2025] Official Repository for The Paper, Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale … ☆18 · Aug 17, 2025 · Updated 6 months ago
- ☆15 · Nov 10, 2025 · Updated 3 months ago
- ☆14 · Jun 16, 2023 · Updated 2 years ago
- A Chinese singing voice dataset, professional male singer, 105 songs, 132 minutes ☆11 · Oct 19, 2023 · Updated 2 years ago
- SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech ☆11 · Jun 30, 2023 · Updated 2 years ago
- ☆32 · Jan 6, 2022 · Updated 4 years ago
- VITS2 using Phoneme-Level Japanese BERT ☆14 · Dec 17, 2023 · Updated 2 years ago
- ☆12 · Feb 9, 2021 · Updated 5 years ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆11 · May 14, 2025 · Updated 9 months ago