soham97 / PAM
PAM is a no-reference audio quality metric for audio generation tasks
☆42 Updated 2 months ago
Related projects:
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆44 Updated last month
- [InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter ☆70 Updated 2 months ago
- The open source code for SimpleSpeech series ☆85 Updated last month
- Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models. ☆66 Updated 3 weeks ago
- High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec ☆63 Updated 3 months ago
- This is the official train-dev-test release of the Interspeech 2024 Discrete Speech Representation Challenge. ☆30 Updated 7 months ago
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆81 Updated last month
- Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P… ☆33 Updated 9 months ago
- Unofficial SoundStream implementation in PyTorch with training code and a 16 kHz pretrained checkpoint ☆54 Updated last year
- Please visit https://thuhcsi.github.io/SnakeGAN/ ☆36 Updated last year
- Unofficial implementation of NANSY++ in PyTorch Lightning ☆46 Updated 6 months ago
- NOMAD is a fully unsupervised non-matching reference audio quality metric ☆23 Updated 3 months ago
- Query-conditioned target sound extraction model ☆14 Updated 3 months ago
- This is the implementation of our Interspeech 2022 paper "Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conv… ☆16 Updated last year
- Robust Singing Voice Transcription and MIDI Extraction ☆47 Updated last month
- UTokyo-SaruLab MOS Prediction System ☆49 Updated this week
- [Official Implementation] Acoustic Autoregressive Modeling 🔥 ☆52 Updated 3 weeks ago
- Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment ☆62 Updated 2 months ago
- Source code of APNet2, a vocoder ☆49 Updated 9 months ago
- Unofficial PyTorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (… ☆58 Updated 5 months ago