VIPL-Audio-Visual-Speech-Understanding/deep-face-speechreading

VIPL-Audio-Visual-Speech-Understanding / deep-face-speechreading

Visual speech recognition with face inputs: code and models for the F&G 2020 paper "Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition"

☆19

Alternatives and similar repositories for deep-face-speechreading

Users who are interested in deep-face-speechreading are comparing it to the libraries listed below.

xing96 / MIM-lipreading
Code and model for the paper "Mutual Information Maximization for Effective Lip Reading"
☆19 · Sep 4, 2020 · Updated 5 years ago
VIPL-Audio-Visual-Speech-Understanding / learn-an-effective-lip-reading-model-without-pains
The PyTorch code and model for "Learn an Effective Lip Reading Model without Pains" (https://arxiv.org/abs/2011.07557), which reaches the…
☆168 · Sep 12, 2025 · Updated 10 months ago
ms-dot-k / Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV 2021)
☆22 · Apr 11, 2022 · Updated 4 years ago
jingyunx / Deformation-Flow-Based-Two-stream-Network-for-Lip-Reading
☆15 · Dec 11, 2021 · Updated 4 years ago
prajwalkr / vtp
Official implementation of Visual Transformer Pooling for lip reading
☆41 · Aug 8, 2022 · Updated 3 years ago
sailordiary / LipNet-PyTorch
"LipNet: End-to-End Sentence-level Lipreading" in PyTorch
☆70 · Sep 9, 2019 · Updated 6 years ago
mpc001 / Lipreading_using_Temporal_Convolutional_Networks
ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…
☆437 · May 18, 2023 · Updated 3 years ago
lzuwei / end-to-end-multiview-lipreading
End-to-end multiview lip reading
☆10 · Jan 26, 2018 · Updated 8 years ago
dobby-seo / Pytorch-MHAtt-RNN-KWS
PyTorch implementation of a Multi-Head-Attention RNN for keyword spotting
☆19 · Nov 13, 2020 · Updated 5 years ago
VIPL-Audio-Visual-Speech-Understanding / LipNet-PyTorch
The state-of-the-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxi…
☆237 · Sep 21, 2022 · Updated 3 years ago
georgesterpu / pyVSR
Python toolkit for Visual Speech Recognition
☆37 · Jun 10, 2020 · Updated 6 years ago
cadia-lvl / kaldi-speaker-diarization
This repository creates speaker diarization recipes to be used within the egs folder of Kaldi.
☆17 · Aug 12, 2024 · Updated last year
Charbel199 / ml-concepts
A repository created to keep track of all the useful machine learning concepts that I learn throughout the years along with some resourc…
☆12 · Nov 26, 2024 · Updated last year
mpc001 / end-to-end-lipreading
PyTorch code for End-to-End Audiovisual Speech Recognition
☆183 · Nov 18, 2022 · Updated 3 years ago
srinivr / kaldi-long-audio-alignment
Long audio alignment using Kaldi
☆23 · Apr 22, 2021 · Updated 5 years ago
Li-Sanze / ID-Card
Given the front and back of an ID card, recognizes all the text information on the card
☆10 · Sep 4, 2019 · Updated 6 years ago
takumakanari / japanese-numbers-python
A parser for Japanese numbers (Kanji, Arabic) in natural language.
☆21 · Apr 4, 2020 · Updated 6 years ago
NirHeaven / D3D
The method proposed in "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild"
☆26 · Nov 23, 2018 · Updated 7 years ago
mysee1989 / GraphJigsaw
Code for the paper: Graph Jigsaw Learning for Cartoon Face Recognition
☆10 · Jul 1, 2022 · Updated 4 years ago
tomaarsen / TTSTextNormalization
Convert English text from written expressions into spoken forms
☆32 · Jun 22, 2022 · Updated 4 years ago
AntXinyuan / SSP
Semantic-decoupled Spatial Partition Guided Point-supervised Oriented Object Detection
☆13 · Jul 7, 2026 · Updated 2 weeks ago
FateScript / nnprof
Profiling tools for PyTorch nn models
☆42 · Jan 11, 2021 · Updated 5 years ago
LiuDongyang6 / FCFD
Official implementation of the paper "Function-Consistent Feature Distillation" (ICLR 2023)
☆30 · Jul 5, 2023 · Updated 3 years ago
SSahuDS / Lipreading-Using-Mutimodal-Speech-Recognition
Multimodal speech recognition for phoneme-level prediction using audio-visual data from the TCDTIMIT dataset, implementing RNNs with LSTMs for…
☆15 · Jul 27, 2023 · Updated 2 years ago
WisleyWang / DC-AI-LipReading
☆11 · May 31, 2020 · Updated 6 years ago
nlp-waseda / traveling-across-languages
Official repo and evaluation implementation of KnowRecall and VisRecall
☆10 · May 22, 2025 · Updated last year
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated 2 years ago
Lenvia / RBM-BP-character-recognition
RBM+BP neural network for recognizing handwritten digits and English characters
☆11 · Mar 25, 2023 · Updated 3 years ago
VIPL-Audio-Visual-Speech-Understanding / LRW1000--CAS-VSR-W1k
DenseNet3D model in "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild", https://arxiv.org/abs/1810.069…
☆123 · Mar 13, 2026 · Updated 4 months ago
mashrurmorshed / Torch-KWT
Unofficial PyTorch implementation of "Keyword Transformer: A Self-Attention Model for Keyword Spotting", Berg et al. 2021.
☆41 · Oct 11, 2022 · Updated 3 years ago
ichn-hu / DSP-Audio-Collector
Web app created to collect audio for a course project
☆10 · Apr 6, 2018 · Updated 8 years ago
Zhong-master / PocketSphinx_Speech_Recognition
PocketSphinx_Speech_Recognition
☆10 · Aug 5, 2021 · Updated 4 years ago
AI-Research-BD / Keyword-MLP
Official PyTorch implementation of "Attention-Free Keyword Spotting", Mashrur M. Morshed & Ahmad Omar Ahsan, PML4DC @ ICLR 2022.
☆15 · Nov 5, 2022 · Updated 3 years ago
XuyangGuo / STD-GAN
Instance-level Facial Attributes Editing (CVIU 2021)
☆15 · Jul 19, 2022 · Updated 4 years ago
yuweiwan / ASR-HMM-DNN
Speech recognition based on deep neural network/hidden Markov model
☆10 · Jun 3, 2020 · Updated 6 years ago
xmy0916 / pytorch_crnn
CRNN text recognition written in PyTorch, with simplified code to help beginners get started
☆13 · Feb 21, 2021 · Updated 5 years ago
AkashKV-1998 / Warehouse-Management-System
The successful and effective management of a busy and complex warehouse relies upon the control and location of stock within the warehous…
☆15 · Mar 17, 2021 · Updated 5 years ago
arubique / OCCAM
This is an implementation of the paper "Are We Done with Object-Centric Learning?"
☆13 · Jun 21, 2026 · Updated last month
Alex-Riviello / KWS_MCU
☆16 · May 8, 2022 · Updated 4 years ago