ASLP-lab / WenetSpeech-Chuan
Official repository for the WenetSpeech-Chuan dataset.
☆138Updated 2 months ago
Alternatives and similar repositories for WenetSpeech-Chuan
Users who are interested in WenetSpeech-Chuan compare it to the repositories listed below.
- Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation☆129Updated last week
- LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement☆45Updated 10 months ago
- Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.☆63Updated 4 months ago
- The baselines of ARC-Challenge-Interspeech2026☆56Updated last month
- Official Repository of Paper: "Emilia-NV: A Non-Verbal Speech Dataset with Word-Level Annotation for Human-Like Speech Modeling"☆83Updated 4 months ago
- A Massive Contextual Speech Recognition Benchmark.☆99Updated 5 months ago
- Inference code for Audiodec-Valle-Wenetspeech4TTS☆50Updated last year
- ☆78Updated 7 months ago
- A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems.☆104Updated 8 months ago
- The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…☆49Updated 8 months ago
- LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement☆94Updated 9 months ago
- ☆78Updated 5 months ago
- ☆30Updated last week
- AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data☆34Updated 2 years ago
- An instruct text-to-speech solution based on LLaSA and CosyVoice2 developed by the ASLP lab and collaborators.☆197Updated last week
- CosyVoice_DPO_NOTES: Supercharge Your Cosyvoice model with Cutting-Edge DPO Fine-Tuning!☆112Updated 5 months ago
- A Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows☆213Updated 3 weeks ago
- TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization☆103Updated last year
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆79Updated 2 months ago
- Exploring Binary Classification Loss for Speaker Verification☆18Updated 2 years ago
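- Repository for the Xmart Youth Forum, hosting video replays and lecture notes from past student forums and frontier lectures; for the latest Xmart announcements, follow the WeChat official account [XLANCE Lab]☆39Updated last month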
- SpeechJudge: Towards Human-Level Judgment for Speech Naturalness (https://arxiv.org/abs/2511.07931)☆49Updated last month
- PyTorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR".☆85Updated last week
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…☆41Updated 10 months ago
- ☆110Updated 4 months ago
- ICASSP2026 HumDial Challenge☆31Updated last month
- ☆59Updated 3 months ago
- In-car multi-channel speech transcription system of AISHELL-5.☆39Updated 7 months ago
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models☆55Updated 9 months ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling☆95Updated last year