This is a Winter of Code project aimed at speech enhancement for text-to-speech models.
☆25Feb 6, 2022Updated 4 years ago
Alternatives and similar repositories for woc-tts-enhancement
Users that are interested in woc-tts-enhancement are comparing it to the libraries listed below.
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024)☆26Feb 22, 2024Updated 2 years ago
- Script to calculate SNR and SDR using Python☆93Jul 7, 2020Updated 5 years ago
- ☆19Jun 29, 2025Updated 11 months ago
- [AAAI 2026 & ACL 2026] The official implementation of the DIFFA series for dLLM-based large audio language model☆82Apr 7, 2026Updated 2 months ago
- Baseline kaldi script for UA-SPEECH corpus☆32Oct 16, 2024Updated last year
- An evaluation toolkit for voice conversion models.☆42Jul 11, 2021Updated 4 years ago
- Unofficial implementation of NANSY++ in PyTorch Lightning☆50Mar 11, 2024Updated 2 years ago
- StyleTTS2 + Vocos as a Decoder☆13Mar 24, 2025Updated last year
- This is the implementation of our Interspeech 2022 paper "Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conv…☆21Sep 18, 2023Updated 2 years ago
- Adaptive recursive wideband noise filter using the Recursive Least Squares (RLS) algorithm☆10Mar 5, 2016Updated 10 years ago
- Chatterbox TTS + voice cloning using ONNX☆28Jun 13, 2026Updated last week
- Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit…☆83Jan 7, 2023Updated 3 years ago
- Prompting Large Language Models with Audio for General-Purpose Speech Summarization☆20May 14, 2025Updated last year
- ☆15Sep 10, 2023Updated 2 years ago
- Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"☆14Feb 13, 2022Updated 4 years ago
- Noise cancellation, suppression☆13Apr 8, 2019Updated 7 years ago
- Audio signals noise reduction☆13Dec 27, 2021Updated 4 years ago
- TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion☆147Jan 15, 2024Updated 2 years ago
- Wiener filter for audio noise reduction☆11Dec 6, 2017Updated 8 years ago
- Unofficial PyTorch Implementation of StarGAN-ZSVC☆14Aug 5, 2021Updated 4 years ago
- ☆101Jan 19, 2026Updated 5 months ago
- ☆11Mar 22, 2023Updated 3 years ago
- [INTERSPEECH 2024] The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for …☆179May 20, 2025Updated last year
- ☆12Jun 10, 2021Updated 5 years ago
- [CVPR 2025] Official implementation of paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie…☆23Jun 6, 2025Updated last year
- ☆19Aug 23, 2024Updated last year
- [NeurIPS 2025] Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM☆27Feb 10, 2026Updated 4 months ago
- ☆17Mar 25, 2025Updated last year
- ☆20Jul 16, 2023Updated 2 years ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆16Dec 3, 2024Updated last year
- VI-SVC model is just VITS without MAS and DurationPredictor.☆10Nov 9, 2023Updated 2 years ago
- WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.☆13Sep 27, 2024Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated last year
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"☆15Jun 28, 2024Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- Updated fork of g2pk☆13Aug 18, 2023Updated 2 years ago
- Implementation of the subscale framework from the WaveRNN paper, building on top of Fatchord's WaveRNN repo☆19Oct 8, 2020Updated 5 years ago
- ☆28Nov 5, 2021Updated 4 years ago
- Contrastive Bayesian Analysis for Deep Metric Learning and an Integrated Deep Metric Learning Toolbox Based on Pytorch☆13Dec 27, 2022Updated 3 years ago