ANLGBOY / ICLR-2024-OpenReview-Ratings
☆31 · Nov 24, 2023 · Updated 2 years ago
Alternatives and similar repositories for ICLR-2024-OpenReview-Ratings
Users who are interested in ICLR-2024-OpenReview-Ratings are comparing it to the repositories listed below.
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec" ☆35 · Oct 23, 2025 · Updated 3 months ago
- Code for the CVPR2021 workshop paper "Noise Conditional Flow Model for Learning the Super-Resolution Space" ☆64 · Jun 21, 2021 · Updated 4 years ago
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (… ☆61 · Apr 4, 2024 · Updated last year
- ☆44 · Sep 19, 2024 · Updated last year
- TransferTTS (Zero-Shot learning of VITS) ☆100 · Sep 23, 2022 · Updated 3 years ago
- ☆22 · Jul 30, 2025 · Updated 6 months ago
- ☆13 · Aug 13, 2023 · Updated 2 years ago
- ☆51 · Jul 6, 2023 · Updated 2 years ago
- ☆167 · Sep 19, 2024 · Updated last year
- Objective metrics used in several text-to-speech (TTS) papers. ☆52 · Jun 17, 2025 · Updated 8 months ago
- Korean phoneme dictionary generator for training Montreal Forced Aligner (MFA) ☆13 · Feb 27, 2021 · Updated 4 years ago
- Simple tool for speech dataset augmentation for modeling various prosodies. ☆14 · Jan 14, 2021 · Updated 5 years ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024 ☆16 · Nov 19, 2024 · Updated last year
- Official PyTorch implementation of "EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders" ☆14 · Sep 20, 2024 · Updated last year
- Multi-speaker & Multi-style TTS ☆29 · Jul 3, 2024 · Updated last year
- BigVGAN with Neural Source-Filter ☆56 · Sep 21, 2023 · Updated 2 years ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 ☆12 · May 13, 2024 · Updated last year
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Jun 2, 2023 · Updated 2 years ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 · May 16, 2025 · Updated 9 months ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆64 · May 30, 2023 · Updated 2 years ago
- ☆12 · Jul 6, 2023 · Updated 2 years ago
- ICASSP 2023 Accepted ☆189 · May 6, 2024 · Updated last year
- Unofficial pytorch implementation of BigVGAN: A Universal Neural Vocoder with Large-Scale Training ☆135 · Feb 18, 2023 · Updated 3 years ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder ☆31 · Aug 30, 2025 · Updated 5 months ago
- V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in MLLMs ☆24 · Jul 31, 2025 · Updated 6 months ago
- ☆170 · Jul 25, 2022 · Updated 3 years ago
- Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac… ☆34 · May 25, 2024 · Updated last year
- Official implementation of "Avocodo: Generative Adversarial Network for Artifact-Free Vocoder" (AAAI2023) ☆154 · Feb 1, 2023 · Updated 3 years ago
- Pytorch implementation for “V2C: Visual Voice Cloning” ☆33 · Jan 28, 2023 · Updated 3 years ago
- ☆39 · Oct 1, 2023 · Updated 2 years ago
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆80 · Sep 27, 2023 · Updated 2 years ago
- ☆52 · Jan 6, 2022 · Updated 4 years ago
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆267 · Jan 13, 2025 · Updated last year
- A repo that builds text-to-music datasets from scratch, used in MuseControlLite [ICML2025] ☆27 · May 20, 2025 · Updated 8 months ago
- An AR+AR TTS attempt. ☆18 · Jan 13, 2025 · Updated last year
- Official Code for "Rethinking Diffusion Model in High Dimension" ☆24 · May 20, 2025 · Updated 8 months ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP] ☆25 · Jul 5, 2022 · Updated 3 years ago
- Project page for "Improving Few-shot Learning for Talking Face System with TTS Data Augmentation" for ICASSP2023 ☆86 · Oct 10, 2023 · Updated 2 years ago
- [ICASSP 2024] TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models ☆183 · Nov 22, 2024 · Updated last year