umbertocappellazzo / Llama-AVSR

Official PyTorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners" [ICASSP 2025] and "Mitigating Attention Sinks and Massive Activations in Audio-Visual Speech Recognition with LLMs" [ICASSP 2026].
56 | Jan 18, 2026 | Updated 3 weeks ago

Alternatives and similar repositories for Llama-AVSR

Users who are interested in Llama-AVSR are comparing it to the libraries listed below.
