arxrean/LipRead-seq2seq

arxrean / LipRead-seq2seq

An unofficial (PyTorch) implementation for the paper Deep Lip Reading: A comparison of models and an online application.

☆10

Alternatives and similar repositories for LipRead-seq2seq

Users who are interested in LipRead-seq2seq are comparing it to the repositories listed below.

SpringHuo / MAVD
The MAVD represents Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima…
☆20 · Apr 22, 2024 · Updated 2 years ago
DataoceanAI / CNVSRC2023Baseline
Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)
☆23 · Apr 27, 2024 · Updated 2 years ago
ms-dot-k / LRW_ID
The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…
☆10 · Oct 12, 2023 · Updated 2 years ago
umbertocappellazzo / Llama-AVSR
Official Pytorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners" [ICASSP 2025] and "Mitigat…
☆64 · Jan 18, 2026 · Updated 6 months ago
IMLHF / SpecAugmentPyTorch
A Pytorch (support batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech…
☆11 · Jul 24, 2024 · Updated 2 years ago
prajwalkr / transpotter
Official implementation of Transpotter, published in BMVC 2021
☆16 · Aug 6, 2022 · Updated 3 years ago
JeongHun0716 / Personalized-Lip-Reading
Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language (AAAI 2025)
☆24 · Jun 29, 2026 · Updated 3 weeks ago
vincenthouyi / elf_rs
A no_std lib for ELF file loading
☆18 · Oct 13, 2023 · Updated 2 years ago
sailordiary / LipNet-PyTorch
"LipNet: End-to-End Sentence-level Lipreading" in PyTorch
☆70 · Sep 9, 2019 · Updated 6 years ago
prajwalkr / vtp
Official Implementation of Visual Transformer Pooling for Lip reading
☆41 · Aug 8, 2022 · Updated 3 years ago
VIPL-Audio-Visual-Speech-Understanding / LipNet-PyTorch
The state-of-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxi…
☆237 · Sep 21, 2022 · Updated 3 years ago
spkgyk / TDFNet
Official code release for "TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion", accepted ICIST 2023
☆14 · Mar 17, 2024 · Updated 2 years ago
afourast / deep_lip_reading
Code and models for evaluating a state-of-the-art lip reading network
☆196 · Mar 24, 2023 · Updated 3 years ago
UARK-AICV / UARK-AICV.github.io
[Lab] lab website
☆12 · May 29, 2026 · Updated last month
santi-pdp / ahoproc_tools
Tools for Ahocoder data processing and evaluation metrics
☆15 · Apr 22, 2024 · Updated 2 years ago
Exgc / AVMuST-TED
☆24 · Mar 30, 2024 · Updated 2 years ago
cilkim1 / speech_ani_gan
An implementation of http://openaccess.thecvf.com/content_CVPRW_2019/papers/Sight%20and%20Sound/Konstantinos_Vougioukas_End-to-End_Speech…
☆18 · Mar 19, 2020 · Updated 6 years ago
mbzuai-nlp / sttatts
☆31 · Oct 29, 2024 · Updated last year
ms-dot-k / Visual-Context-Attentional-GAN
PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS2021)
☆25 · Mar 9, 2024 · Updated 2 years ago
JackSyu / Discriminative-Multi-modality-Speech-Recognition
TF code for our CVPR2020 paper "Discriminative Multi-modality Speech Recognition"
☆26 · Apr 27, 2022 · Updated 4 years ago
apple / ml-famae
☆35 · Apr 11, 2024 · Updated 2 years ago
mpc001 / Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages
☆478 · Aug 17, 2023 · Updated 2 years ago
esddse / GUpdater
Code for EMNLP 2019 paper "Learning to Update Knowledge Graphs by Reading News"
☆29 · Nov 26, 2019 · Updated 6 years ago
facebookresearch / facestar
Facestar dataset. High quality audio-visual recordings of human conversational speech.
☆112 · Mar 29, 2022 · Updated 4 years ago
yxduir / LLM-SRT
☆28 · Mar 11, 2026 · Updated 4 months ago
cuilimeng / DETERRENT
☆30 · Jun 25, 2020 · Updated 6 years ago
ThreeSR / Good-Learning-Resources
☆12 · Oct 5, 2022 · Updated 3 years ago
ShareChatAI / 3MASSIV
☆13 · May 10, 2022 · Updated 4 years ago
mpc001 / Lipreading_using_Temporal_Convolutional_Networks
ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…
☆438 · May 18, 2023 · Updated 3 years ago
mpc001 / auto_avsr
Auto-AVSR: Lip-Reading Sentences Project
☆429 · Jan 8, 2025 · Updated last year
0215Arthur / HG-GNN
☆29 · Feb 16, 2023 · Updated 3 years ago
kangliu1225 / MGCL
The complete code for the paper "Multimodal Graph Contrastive Learning for Recommendation"
☆15 · Mar 20, 2023 · Updated 3 years ago
SenticNet / multimodal-fusion
Attention-based multimodal fusion for sentiment analysis
☆13 · Aug 14, 2018 · Updated 7 years ago
yakimka / Hackintosh-Dell-7577
Guide for installing Hackintosh on Dell 7577
☆10 · Aug 17, 2019 · Updated 6 years ago
XiaoWuLibs / MyDemo
A demo for testing various features
☆12 · Apr 16, 2020 · Updated 6 years ago
choijeongsoo / utut
[TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation
☆31 · Sep 6, 2024 · Updated last year
wangwei2009 / DistantSpeech
DistantSpeech
☆22 · Oct 9, 2023 · Updated 2 years ago
ryanleary / ctcdecode
PyTorch CTC Decoder bindings
☆42 · Jan 31, 2018 · Updated 8 years ago
dkurzend / ClipClap-GZSL
Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models
☆23 · Apr 15, 2024 · Updated 2 years ago