An unofficial PyTorch implementation of the paper "Deep Lip Reading: A comparison of models and an online application".
☆10May 13, 2020Updated 5 years ago
Alternatives and similar repositories for LipRead-seq2seq
Users that are interested in LipRead-seq2seq are comparing it to the libraries listed below.
- The MAVD represents Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima…☆20Apr 22, 2024Updated last year
- Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)☆23Apr 27, 2024Updated last year
- The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…☆10Oct 12, 2023Updated 2 years ago
- A PyTorch implementation (with batch and channel support) of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech…☆12Jul 24, 2024Updated last year
- Official implementation of Transpotter, published in BMVC 2021☆16Aug 6, 2022Updated 3 years ago
- A no_std lib for ELF file loading☆18Oct 13, 2023Updated 2 years ago
- "LipNet: End-to-End Sentence-level Lipreading" in PyTorch☆69Sep 9, 2019Updated 6 years ago
- Official Implementation of Visual Transformer Pooling for Lip reading☆41Aug 8, 2022Updated 3 years ago
- Official code release for "TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion", accepted ICIST 2023☆12Mar 17, 2024Updated 2 years ago
- The state-of-the-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxi…☆235Sep 21, 2022Updated 3 years ago
- [Lab] lab website☆11Mar 23, 2026Updated 3 weeks ago
- Tools for Ahocoder data processing and evaluation metrics☆15Apr 22, 2024Updated last year
- The world's fastest Python package for calculating integrated loudness (LUFS) from audio data as NumPy arrays☆25Dec 26, 2025Updated 3 months ago
- Code and models for evaluating a state-of-the-art lip reading network☆196Mar 24, 2023Updated 3 years ago
- ☆24Mar 30, 2024Updated 2 years ago
- PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS2021)☆25Mar 9, 2024Updated 2 years ago
- ☆31Oct 29, 2024Updated last year
- TF code for our CVPR2020 paper "Discriminative Multi-modality Speech Recognition"☆26Apr 27, 2022Updated 3 years ago
- Facestar dataset. High quality audio-visual recordings of human conversational speech.☆110Mar 29, 2022Updated 4 years ago
- An implementation of http://openaccess.thecvf.com/content_CVPRW_2019/papers/Sight%20and%20Sound/Konstantinos_Vougioukas_End-to-End_Speech…☆18Mar 19, 2020Updated 6 years ago
- Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models☆22Apr 15, 2024Updated last year
- Visual Speech Recognition for Multiple Languages☆465Aug 17, 2023Updated 2 years ago
- ☆35Apr 11, 2024Updated 2 years ago
- ☆25Jul 15, 2024Updated last year
- DistantSpeech☆22Oct 9, 2023Updated 2 years ago
- Auto-AVSR: Lip-Reading Sentences Project☆409Jan 8, 2025Updated last year
- Code for EMNLP 2019 paper "Learning to Update Knowledge Graphs by Reading News"☆29Nov 26, 2019Updated 6 years ago
- ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…☆433May 18, 2023Updated 2 years ago
- Paper list of Video LLM hallucination. Welcome to Star and Contribute!☆23Apr 1, 2026Updated last week
- Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language (AAAI 2025)☆23Mar 17, 2025Updated last year
- ☆30Jun 25, 2020Updated 5 years ago
- ☆12Oct 5, 2022Updated 3 years ago
- Attention-based multimodal fusion for sentiment analysis☆13Aug 14, 2018Updated 7 years ago
- ☆19Mar 10, 2023Updated 3 years ago
- Demo for DART, Audio Imagination workshop submission in NeurIPS 2024☆13Apr 15, 2025Updated 11 months ago
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation☆31Sep 6, 2024Updated last year
- Guide for installing Hackintosh on Dell 7577☆10Aug 17, 2019Updated 6 years ago
- ☆29Feb 16, 2023Updated 3 years ago
- A demo for testing various features☆12Apr 16, 2020Updated 5 years ago