Sreyan88/LipGER

Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition

☆19

Alternatives and similar repositories for LipGER

Users who are interested in LipGER are comparing it to the repositories listed below.

tzyll / ChineseHP
Dataset for Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models in Interspeech 2024.
☆16 Jul 4, 2024 Updated 2 years ago
rithiksachdev / PostASR-Correction-SLT2024
☆18 Jul 22, 2024 Updated 2 years ago
Hypotheses-Paradise / UADF
☆17 May 5, 2024 Updated 2 years ago
W-Wu / ERC-SLT22
Code for "Distribution-based Emotion Recognition in Conversation"
☆18 Feb 6, 2023 Updated 3 years ago
hongfeixue / StutteringSpeechChallenge
SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
☆12 Jun 11, 2024 Updated 2 years ago
roudimit / whisper-flamingo
Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation
☆210 Jul 29, 2025 Updated last year
sungnyun / avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
☆13 Oct 22, 2024 Updated last year
ahaliassos / raven
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
☆82 Feb 27, 2025 Updated last year
W-Wu / DEER
☆12 Aug 25, 2023 Updated 2 years ago
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 Jun 23, 2024 Updated 2 years ago
wonjune-kang / expressive-speech-retrieval
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
☆15 Aug 18, 2025 Updated 11 months ago
revsic / torch-retriever-vc
PyTorch implementation of Retriever: Learning Content-Style Representation
☆12 Jan 27, 2023 Updated 3 years ago
yangjingyuan / ConstDecoder
☆11 Oct 24, 2022 Updated 3 years ago
Srijith-rkr / Whispering-LLaMA
EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction
☆271 May 19, 2024 Updated 2 years ago
sarulab-speech / ml-audiocaps
Multi-lingual AudioCaps
☆14 Nov 20, 2023 Updated 2 years ago
tango4j / llm_speaker_tagging
SLT 2024 Challenge: Post-ASR-Speaker-Tagging
☆16 Jun 16, 2024 Updated 2 years ago
amazon-science / contextual-attention-nlm
Accompanying code for paper "Attention-Based Contextual Language Model Adaptation for Speech Recognition", submitted to ACL 2021.
☆14 Jul 25, 2023 Updated 3 years ago
Alibaba-NLP / AISHELL-NER
[ICASSP 2022] AISHELL-NER: Named Entity Recognition from Chinese Speech
☆26 Apr 20, 2022 Updated 4 years ago
YasserdahouML / VSR_test_set
WildVSR
☆22 Dec 13, 2023 Updated 2 years ago
YasserdahouML / visper
ViSpeR: Multilingual Audio-Visual Speech Recognition
☆59 Apr 17, 2025 Updated last year
joannahong / AV-RelScore
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆35 Jun 20, 2023 Updated 3 years ago
sinhat98 / adapter-wavlm
☆46 Feb 16, 2023 Updated 3 years ago
OpenSoraAI / OpenSora
Exquisite video generation
☆15 Feb 18, 2024 Updated 2 years ago
YUCHEN005 / MIR-GAN
Code for paper "MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recogni…
☆16 Jun 21, 2023 Updated 3 years ago
AIRC-KETI / Korean-Copora
☆14 Dec 9, 2021 Updated 4 years ago
teamtee / LLM-ASR-Error-Correction
This is a framework for using large language models to improve ASR recognition accuracy. You need to provide the recognized text and tag …
☆18 Jun 5, 2025 Updated last year
cryingjin / AMIOK
[11th Tobigs Conference] AM I OK? - Psychological diagnosis AI based on specialists' answers
☆12 Jan 19, 2021 Updated 5 years ago
seungheondoh / hi_kia
wake-up word emotion recognition [APSIPA 2022]
☆17 Nov 11, 2022 Updated 3 years ago
wxzyd123 / Pivotal_Objective_Perturbation
[CCS-LAMPS 2024] Mitigating Unauthorized Speech Synthesis for Voice Protection
☆20 Nov 1, 2024 Updated last year
GeorgeEfstathiadis / LLM-Diarize-ASR-Agnostic
Repository for "LLM-based speaker diarization correction: A generalizable approach" paper
☆23 Jul 31, 2024 Updated last year
YUCHEN005 / GILA
Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition"
☆18 Jun 21, 2023 Updated 3 years ago
lifeiteng / NotebookTTS
Text-To-Speech for NotebookLM
☆39 Jul 20, 2025 Updated last year
kaistmm / V2SFlow
[ICASSP 2025] V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow
☆21 Jun 3, 2025 Updated last year
sungnyun / ARMHuBERT
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
☆41 Aug 29, 2024 Updated last year
ga642381 / SpeechPrompt
**Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…
☆102 Apr 10, 2025 Updated last year
linusericsson / ssl-invariances
Official code for the paper "Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks".
☆16 Dec 7, 2021 Updated 4 years ago
jordicapde / stutter-former
StutterFormer is an AI model that aims to be able to receive a speech sample with stuttering disfluencies, and return it with the disflue…
☆19 Feb 10, 2023 Updated 3 years ago
HappyColor / DrawSpeech_PyTorch
☆25 Nov 25, 2025 Updated 8 months ago
frankenliu / LOAE
☆10 Sep 25, 2024 Updated last year