xing96/MIM-lipreading

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/xing96/MIM-lipreading)

xing96 / MIM-lipreading

Code and model for paper <Mutual Information Maximization for Effective Lip Reading>

☆19

Alternatives and similar repositories for MIM-lipreading

Users that are interested in MIM-lipreading are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

jingyunx / Deformation-Flow-Based-Two-stream-Network-for-Lip-Reading
View on GitHub
☆15Dec 11, 2021Updated 4 years ago
NirHeaven / D3D
View on GitHub
The proposed method in LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild
☆26Nov 23, 2018Updated 7 years ago
VIPL-Audio-Visual-Speech-Understanding / deep-face-speechreading
View on GitHub
Visual speech recognition with face inputs: code and models for F&G 2020 paper "Can We Read Speech Beyond the Lips? Rethinking RoI Select…
☆19Apr 12, 2021Updated 5 years ago
VIPL-Audio-Visual-Speech-Understanding / learn-an-effective-lip-reading-model-without-pains
View on GitHub
The PyTorch Code and Model In "Learn an Effective Lip Reading Model without Pains", (https://arxiv.org/abs/2011.07557), which reaches the…
☆168Sep 12, 2025Updated 10 months ago
deeplsd / Merkel-Podcast-Corpus
View on GitHub
This dataset is presented in the paper Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video…
☆12Sep 21, 2022Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
ms-dot-k / Visual-Audio-Memory
View on GitHub
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)
☆22Apr 11, 2022Updated 4 years ago
mpc001 / end-to-end-lipreading
View on GitHub
Pytorch code for End-to-End Audiovisual Speech Recognition
☆183Nov 18, 2022Updated 3 years ago
ms-dot-k / Multi-head-Visual-Audio-Memory
View on GitHub
PyTorch implementation of "Distinguishing Homophenes using Multi-Head Visual-Audio Memory" (AAAI2022)
☆27Mar 9, 2024Updated 2 years ago
lzuwei / end-to-end-multiview-lipreading
View on GitHub
End to End Multiview Lip Reading
☆10Jan 26, 2018Updated 8 years ago
VIPL-Audio-Visual-Speech-Understanding / VIPL-AVSU-Group
View on GitHub
Collection of works from VIPL-AVSU
☆50Updated this week
mpc001 / Lipreading_using_Temporal_Convolutional_Networks
View on GitHub
ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…
☆438May 18, 2023Updated 3 years ago
ahaliassos / raven
View on GitHub
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
☆82Feb 27, 2025Updated last year
ms-dot-k / LRW_ID
View on GitHub
The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…
☆10Oct 12, 2023Updated 2 years ago
artem179 / WLAS
View on GitHub
The implementation of 'Watch, Listen, Attend and Spell’ (WLAS) network that learns to transcribe videos of mouth motion to character on p…
☆11Mar 23, 2018Updated 8 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
prajwalkr / vtp
View on GitHub
Official Implementation of Visual Transformer Pooling for Lip reading
☆41Aug 8, 2022Updated 3 years ago
VIPL-Audio-Visual-Speech-Understanding / LipNet-PyTorch
View on GitHub
The state-of-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxi…
☆237Sep 21, 2022Updated 3 years ago
whzikaros / g2pL
View on GitHub
The implementation of g2pL with a new open dataset.
☆16May 14, 2023Updated 3 years ago
JackSyu / Discriminative-Multi-modality-Speech-Recognition
View on GitHub
TF code for our CVPR2020 paper "Discriminative Multi-modality Speech Recognition"
☆26Apr 27, 2022Updated 4 years ago
Exgc / OpenSR
View on GitHub
The official implementation of OpenSR (ACL2023 Oral)
☆17Nov 29, 2023Updated 2 years ago
voletiv / lipreading-in-the-wild-experiments
View on GitHub
My experiments in lip reading using deep learning with the LRW dataset
☆54Mar 14, 2021Updated 5 years ago
mpc001 / Visual_Speech_Recognition_for_Multiple_Languages
View on GitHub
Visual Speech Recognition for Multiple Languages
☆478Aug 17, 2023Updated 2 years ago
Exgc / AVMuST-TED
View on GitHub
☆24Mar 30, 2024Updated 2 years ago
kyegomez / MM1
View on GitHub
PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"
☆26Updated this week
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
eastonYi / end-to-end_asr_pytorch
View on GitHub
Implements of CTC, Speech-Transformer and CIF for end-to-end speech recognition with pytorch
☆23Jul 28, 2020Updated 5 years ago
sailordiary / LipNet-PyTorch
View on GitHub
"LipNet: End-to-End Sentence-level Lipreading" in PyTorch
☆70Sep 9, 2019Updated 6 years ago
Bucknalla / lopy-raspberrypi
View on GitHub
🎮 Use a Raspberry Pi to control a LoPy over UART
☆12Mar 9, 2017Updated 9 years ago
guanxiongsun / STPN
View on GitHub
[ICCV2023] Spatio-temporal Prompting Network for Robust Video Feature Extraction
☆10Aug 17, 2023Updated 2 years ago
wuyinwuxian / Neural_Network_optimization_method
View on GitHub
这是一个Matlab代码，里面包括五种常见神经网络优化算法的对比。包括SGD、SGDM、Adagrad、AdaDelta、Adam
☆11Mar 23, 2022Updated 4 years ago
TarekVito / ColorCoherenceVector
View on GitHub
Color Coherence Vector is a powerful color-based image retrieval (Matlab)
☆11Feb 27, 2015Updated 11 years ago
lilianemomeni / KWS-Net
View on GitHub
Seeing Wake Words: Audio-visual Keyword Spotting
☆67Sep 16, 2020Updated 5 years ago
lshiwjx / deformable-3d-convnets
View on GitHub
Deformable 3D ConvNets for Action Recognition
☆10Jan 21, 2018Updated 8 years ago
YasserdahouML / VSR_test_set
View on GitHub
WildVSR
☆22Dec 13, 2023Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
nguyenvulebinh / AVSRCocktail
View on GitHub
Audio-Visual Speech Recognition
☆26Jul 7, 2025Updated last year
vvvb-github / AVSegFormer
View on GitHub
[AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer
☆74Mar 6, 2025Updated last year
saschaschramm / MonteCarloTreeSearch
View on GitHub
This project applies Monte Carlo Tree Search (MCTS) to a simple grid world.
☆10May 30, 2018Updated 8 years ago
VIPL-Audio-Visual-Speech-Understanding / LRW1000--CAS-VSR-W1k
View on GitHub
DenseNet3D Model In "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild", https://arxiv.org/abs/1810.069…
☆123Mar 13, 2026Updated 4 months ago
chaddy1004 / tSNE-helper
View on GitHub
code to help with tsne plotting
☆16May 19, 2020Updated 6 years ago
SSahuDS / Lipreading-Using-Mutimodal-Speech-Recognition
View on GitHub
Multimodal Speech Recognition for phoneme level prediction using Audio-Visual data from TCDTIMIT dataset implementing RNNs with LSTMs for…
☆15Jul 27, 2023Updated 2 years ago
tsiangleo / TensorFlowMnist
View on GitHub
☆15Apr 27, 2017Updated 9 years ago