NKU-HLT / RAMP_MOS
Retrieval-Augmented MOS Prediction with Prior Knowledge Integration
☆13 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for RAMP_MOS
- ☆139 Updated 4 months ago
- Paper List ☆18 Updated last month
- Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System. ☆109 Updated last week
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement ☆118 Updated 3 weeks ago
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3 ☆166 Updated 7 months ago
- Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech. ☆201 Updated 10 months ago
- Unofficial implementation of High Fidelity Neural Audio Compression ☆136 Updated 3 months ago
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆160 Updated 4 months ago
- Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours) ☆289 Updated this week
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words ☆42 Updated 4 months ago
- BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing ☆45 Updated 8 months ago
- This is the official repo of our work titled "The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio". ☆40 Updated last month
- [ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels ☆28 Updated 8 months ago
- [IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer ☆115 Updated 7 months ago
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆122 Updated last month
- Unified Speech Language Model for the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆139 Updated last year
- Audio Codec Speech processing Universal PERformance Benchmark ☆220 Updated 3 weeks ago
- Training code for FACodec presented in NaturalSpeech 3 ☆178 Updated 2 months ago
- ☆17 Updated 8 months ago
- UT-Sarulab MOS prediction system using SSL models ☆188 Updated 7 months ago
- [INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation ☆35 Updated last year
- Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model ☆104 Updated last month
- [INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark ☆146 Updated 5 months ago
- ☆55 Updated 11 months ago
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a… ☆480 Updated 5 months ago
- The open source code for LLM-Codec ☆114 Updated 3 months ago
- It's a repository for implementations of neural speech editing algorithms. ☆191 Updated 10 months ago
- ☆25 Updated 2 years ago
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024) ☆16 Updated 3 months ago
- ☆43 Updated last year