NKU-HLT / DiffEditor
☆13Updated 2 months ago
Alternatives and similar repositories for DiffEditor:
Users who are interested in DiffEditor are comparing it to the libraries listed below:
- Retrieval-Augmented MOS Prediction with Prior Knowledge Integration☆25Updated last month
- [ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels☆35Updated last year
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…☆26Updated last month
- ☆35Updated 3 weeks ago
- Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).☆25Updated 6 months ago
- Speech Human Evaluation Estimation Toolkit (SHEET)☆65Updated 5 months ago
- A 6-Million Audio-Caption Paired Dataset Built with an LLM- and ALM-Based Automatic Pipeline☆126Updated 4 months ago
- A Singing Style Conversion Framework Based On Audio Infilling☆20Updated last month
- ☆48Updated 3 weeks ago
- ☆70Updated last year
- ☆17Updated last year
- It includes papers in the speech and audio field. Now updated: ICLR2023-2025, ICML2023-2024, NeurIPS2023-2024, ACMMM2024, AAAI2024, ACL2024, EMNLP…☆49Updated this week
- ☆28Updated 11 months ago
- ☆48Updated 7 months ago
- Sound Event Detection (SED) paper collection☆14Updated 10 months ago
- This is the official repository of a new SOTA diffusion-model-based method for speech enhancement☆39Updated 8 months ago
- ☆51Updated 5 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆58Updated 5 months ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.☆32Updated last year
- Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995☆76Updated 4 months ago
- This repository follows papers and reports on discrete speech representation learning and speech tokenization methods for speech language…☆15Updated last year
- ☆13Updated last year
- ☆65Updated last year
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".☆27Updated 3 weeks ago
- Official implementation of DNSMOS Pro (accepted at INTERSPEECH 2024).☆31Updated last month
- VoicePAT is a modular and efficient toolkit for voice privacy research, with main focus on speaker anonymization.☆51Updated 11 months ago
- A toolkit dedicated to speech evaluation.☆19Updated 7 months ago
- ☆25Updated last year
- PyTorch implementation of Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Pro…☆23Updated last year
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling☆74Updated 5 months ago