YuanGongND / llm_speech_emotion_challenge
☆12Updated 2 months ago
Related projects:
- Transformer-based visually grounded speech models☆19Updated last year
- Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.☆35Updated 6 months ago
- ARCH: Audio Representations benCHmark☆25Updated 3 weeks ago
- A list of papers for child ASR☆24Updated 5 months ago
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".☆12Updated 2 weeks ago
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning☆47Updated 8 months ago
- Implementation of CTC alignment-based single step non-autoregressive transformer☆11Updated last year
- Multi-task speech classification of the accent and gender of an English speaker on Mozilla's Common Voice dataset☆23Updated 2 weeks ago
- A CSRankings-like index for speech researchers☆30Updated last year
- SERAB: a multi-lingual benchmark for speech emotion recognition☆28Updated last year
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM☆14Updated 6 months ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆10Updated 9 months ago
- ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS☆26Updated last year
- A collection of papers related to speech model compression☆24Updated last year
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition"☆17Updated 3 years ago
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆24Updated 9 months ago
- CMU multilingual speech repository☆31Updated 2 years ago
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models☆21Updated 11 months ago
- A repository of code for generating noisy speech data from clean data using deep learning methods☆12Updated 3 years ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆36Updated 3 weeks ago
- A toolkit dedicated to speech evaluation.☆18Updated last month