nickjw0205 / Improving-ASR-with-LLM-Description
☆18Updated last year
Alternatives and similar repositories for Improving-ASR-with-LLM-Description
Users who are interested in Improving-ASR-with-LLM-Description are comparing it to the repositories listed below.
- ☆14Updated last year
- ☆21Updated last year
- ☆13Updated last year
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆21Updated 3 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Updated 2 years ago
- ☆32Updated last year
- ☆29Updated 2 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆15Updated 3 months ago
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning☆14Updated last year
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆60Updated 10 months ago
- Collection of scripts from mHuBERT-147.☆29Updated 9 months ago
- An official implementation of Style-Talker for Spoken Dialogue Generation☆22Updated 7 months ago
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)☆12Updated last year
- BLSP-Emo: Towards Empathetic Large Speech-Language Models☆48Updated last year
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆12Updated 5 months ago
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models☆35Updated last month
- Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Buil…☆29Updated 2 months ago
- Official release of StyleTalk dataset.☆69Updated last year
- ☆37Updated last year
- Collection of works for evaluating (and analyzing) large audio-language models (LALMs)☆33Updated 3 weeks ago
- ☆23Updated 11 months ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆17Updated last year
- Implementation of Google's USM speech model in PyTorch☆30Updated 2 weeks ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆75Updated 10 months ago
- [ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels☆39Updated last year
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆10Updated 11 months ago
- ☆16Updated 4 months ago
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition☆11Updated 5 months ago
- Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial…☆40Updated 7 months ago
- ☆36Updated 5 months ago