sungnyun / avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
☆11 Updated 6 months ago
Alternatives and similar repositories for avsr-temporal-dynamics:
Users who are interested in avsr-temporal-dynamics are comparing it to the libraries listed below:
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT ☆39 Updated 8 months ago
- A PyTorch implementation of the paper: SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification ☆31 Updated 3 years ago
- Transformer-based visually grounded speech models ☆19 Updated 2 years ago
- ☆19 Updated 2 years ago
- An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners" ☆31 Updated last year
- ☆17 Updated last year
- US-based professors who work on audio. For students who would like to apply for RA, PhD, postdoc in audio research. ☆25 Updated last month
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 ☆12 Updated 11 months ago
- ☆36 Updated 2 years ago
- ☆30 Updated 5 months ago
- WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection