YUCHEN005/MIR-GAN

Code for paper "MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition"

☆16

Alternatives and similar repositories for MIR-GAN

Users who are interested in MIR-GAN are comparing it to the libraries listed below.

YUCHEN005 / GILA
Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition"
☆18 · Jun 21, 2023 · Updated 3 years ago
YUCHEN005 / UNA-GAN
Code for paper "Unsupervised Noise adaptation using Data Simulation"
☆14 · May 16, 2024 · Updated 2 years ago
YUCHEN005 / UniVPM
Code for paper "Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition"
☆28 · Jun 21, 2023 · Updated 3 years ago
YUCHEN005 / RATS-Channel-A-Speech-Data
This is a public repository for RATS Channel-A Speech Data, which is a chargeable noisy speech dataset under LDC. Here we release its Log…
☆16 · Oct 22, 2022 · Updated 3 years ago
YUCHEN005 / Gradient-Remedy
Code for paper "Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition"
☆21 · May 24, 2023 · Updated 3 years ago
YUCHEN005 / Unified-Enhance-Separation
Code for paper "Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation"
☆45 · Jul 10, 2024 · Updated 2 years ago
YUCHEN005 / DPSL-ASR
Code for paper "Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition"
☆44 · May 23, 2023 · Updated 3 years ago
YUCHEN005 / NASE
Code for paper "Noise-aware Speech Enhancement using Diffusion Probabilistic Model"
☆89 · Jun 10, 2024 · Updated 2 years ago
shikiw / Modality-Integration-Rate
[ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration R…
☆113 · Jul 9, 2025 · Updated last year
YUCHEN005 / RobustGER
Code for paper "Large Language Models are Efficient Learners of Noise-Robust Speech Recognition"
☆143 · May 8, 2024 · Updated 2 years ago
Hypotheses-Paradise / UADF
☆17 · May 5, 2024 · Updated 2 years ago
YUCHEN005 / STAR-Adapt
Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"
☆241 · May 24, 2024 · Updated 2 years ago
YUCHEN005 / GenTranslate
Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators"
☆199 · Jul 22, 2024 · Updated 2 years ago
Sreyan88 / LipGER
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆19 · Jul 16, 2024 · Updated 2 years ago
zhang-wy15 / Attack_practical_asv
ICASSP 2021 accepted paper
☆20 · May 20, 2021 · Updated 5 years ago
W-Wu / DEER
☆12 · Aug 25, 2023 · Updated 2 years ago
smallflyingpig / learning-to-fool-the-speaker-recognition
code for paper "learning to fool the speaker recognition"
☆10 · Jun 12, 2020 · Updated 6 years ago
shirley-wu / daco
[NeurIPS 2024 D&B Track] DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation
☆14 · Mar 5, 2025 · Updated last year
Miffyli / asv-cm-reinforce
Optimizing speaker verification and spoofing countermeasure systems together with REINFORCE
☆13 · Mar 31, 2021 · Updated 5 years ago
Bose / RAVEN
☆20 · Oct 6, 2025 · Updated 9 months ago
swagshaw / Rainbow-Keywords
Rainbow Keywords - Official PyTorch Implementation
☆14 · Jun 27, 2024 · Updated 2 years ago
roger-tseng / av-superb
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
☆58 · Apr 17, 2024 · Updated 2 years ago
ZhuoYulang / CIF-MMIN
☆41 · Apr 16, 2024 · Updated 2 years ago
CFSRgroup / Paozival
☆13 · Jan 25, 2024 · Updated 2 years ago
JeongHun0716 / Personalized-Lip-Reading
Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language (AAAI 2025)
☆24 · Jun 29, 2026 · Updated last month
enoche / DGVAE
Disentangled Graph Variational Auto-Encoder for Multimodal Recommendation with Interpretability, IEEE TMM
☆16 · Jun 3, 2025 · Updated last year
cug-ygh / TMT
☆21 · Jun 4, 2024 · Updated 2 years ago
yxduir / LLM-SRT
☆28 · Mar 11, 2026 · Updated 4 months ago
fsepteixeira / FoolHD
Repository for the source code and adversarial samples of FoolHD
☆18 · Jan 4, 2022 · Updated 4 years ago
joannahong / AV-RelScore
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆35 · Jun 20, 2023 · Updated 3 years ago
zhiqu22 / mitre
☆12 · Jun 30, 2025 · Updated last year
NaoyukiKanda / LibriSpeechMix
☆38 · Mar 30, 2021 · Updated 5 years ago
archiki / Robust-E2E-ASR
This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 20…
☆49 · Dec 25, 2024 · Updated last year
ms-dot-k / Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)
☆22 · Apr 11, 2022 · Updated 4 years ago
zaocan666 / DyViSE
Dynamic vision-guided speaker embedding for audio-visual speaker diarization
☆12 · Jul 5, 2022 · Updated 4 years ago
ShiningLab / POS-Tagger-for-Punctuation-Restoration
This repository is for the paper Incorporating External POS Tagger for Punctuation Restoration. Proc. Interspeech 2021, 1987-1991, doi: 1…
☆11 · May 24, 2026 · Updated 2 months ago
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
☆13 · Mar 23, 2026 · Updated 4 months ago
adamcsvarga / pcfg-insideoutside
Inside-Outside PCFG Training Tool
☆12 · Mar 29, 2016 · Updated 10 years ago
wutong8023 / SpeechRE
☆11 · Nov 11, 2022 · Updated 3 years ago