Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆19 · Jul 16, 2024 · Updated last year
Alternatives and similar repositories for LipGER
Users that are interested in LipGER are comparing it to the libraries listed below.
- ☆15 · Jul 4, 2024 · Updated last year
- ☆18 · Jul 22, 2024 · Updated last year
- ☆17 · May 5, 2024 · Updated 2 years ago
- Code for "Distribution-based Emotion Recognition in Conversation" ☆18 · Feb 6, 2023 · Updated 3 years ago
- SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge ☆12 · Jun 11, 2024 · Updated 2 years ago
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation ☆209 · Jul 29, 2025 · Updated 10 months ago
- Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024) ☆82 · Feb 27, 2025 · Updated last year
- (SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition ☆13 · Oct 22, 2024 · Updated last year
- ☆12 · Aug 25, 2023 · Updated 2 years ago
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning ☆15 · Jun 23, 2024 · Updated last year
- PyTorch implementation of Retriever: Learning Content-Style Representation ☆12 · Jan 27, 2023 · Updated 3 years ago
- ☆11 · Oct 24, 2022 · Updated 3 years ago
- EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction ☆271 · May 19, 2024 · Updated 2 years ago
- SLT 2024 Challenge: Post-ASR-Speaker-Tagging ☆16 · Jun 16, 2024 · Updated 2 years ago
- Multi-lingual AudioCaps ☆14 · Nov 20, 2023 · Updated 2 years ago
- Accompanying code for paper "Attention-Based Contextual Language Model Adaptation for Speech Recognition", submitted to ACL 2021. ☆14 · Jul 25, 2023 · Updated 2 years ago
- [ICASSP 2022] AISHELL-NER: Named Entity Recognition from Chinese Speech ☆25 · Apr 20, 2022 · Updated 4 years ago
- ViSpeR: Multilingual Audio-Visual Speech Recognition ☆58 · Apr 17, 2025 · Updated last year
- ☆46 · Feb 16, 2023 · Updated 3 years ago
- Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an… ☆35 · Jun 20, 2023 · Updated 2 years ago
- Code for paper "MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recogni… ☆16 · Jun 21, 2023 · Updated 2 years ago
- ☆25 · Nov 25, 2025 · Updated 6 months ago
- Exquisite video generation ☆15 · Feb 18, 2024 · Updated 2 years ago
- ☆14 · Dec 9, 2021 · Updated 4 years ago
- [CCS-LAMPS 2024] Mitigating Unauthorized Speech Synthesis for Voice Protection ☆20 · Nov 1, 2024 · Updated last year
- Wake-up word emotion recognition [APSIPA 2022] ☆17 · Nov 11, 2022 · Updated 3 years ago
- Text-To-Speech for NotebookLM ☆39 · Jul 20, 2025 · Updated 10 months ago
- Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition" ☆18 · Jun 21, 2023 · Updated 2 years ago
- Repository for "LLM-based speaker diarization correction: A generalizable approach" paper ☆22 · Jul 31, 2024 · Updated last year
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT ☆41 · Aug 29, 2024 · Updated last year
- Official code for the paper "Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks". ☆16 · Dec 7, 2021 · Updated 4 years ago
- [11th Tobigs Conference] AM I OK? - Psychological diagnosis AI based on specialists' answers ☆12 · Jan 19, 2021 · Updated 5 years ago
- ☆10 · Sep 25, 2024 · Updated last year
- A speech signal processing library in Python with emphasis on deep learning. ☆31 · Apr 13, 2026 · Updated 2 months ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》 Speec… ☆102 · Apr 10, 2025 · Updated last year
- StutterFormer is an AI model that aims to be able to receive a speech sample with stuttering disfluencies, and return it with the disflue… ☆19 · Feb 10, 2023 · Updated 3 years ago
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation ☆16 · Apr 29, 2025 · Updated last year
- Towards Fine-grained Audio Captioning with Multimodal Contextual Cues ☆87 · Jan 4, 2026 · Updated 5 months ago
- WildVSR ☆22 · Dec 13, 2023 · Updated 2 years ago