matthijsvk/TCDTIMITprocessing

Processing and extraction of face and mouth image files from the TCDTIMIT database

☆47

Alternatives and similar repositories for TCDTIMITprocessing

Users who are interested in TCDTIMITprocessing are comparing it to the repositories listed below.

SSahuDS / Lipreading-Using-Mutimodal-Speech-Recognition
Multimodal Speech Recognition for phoneme level prediction using Audio-Visual data from TCDTIMIT dataset implementing RNNs with LSTMs for…
☆15 · Jul 27, 2023 · Updated 2 years ago
ms-dot-k / LRW_ID
The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…
☆10 · Oct 12, 2023 · Updated 2 years ago
ms-dot-k / Visual-Context-Attentional-GAN
PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS 2021)
☆25 · Mar 9, 2024 · Updated 2 years ago
matthijsvk / TIMITspeech
Speech recognition on the TIMIT (or any other) dataset
☆44 · Nov 2, 2017 · Updated 8 years ago
matthijsvk / multimodalSR
Multimodal speech recognition using lipreading (with CNNs) and audio (using LSTMs). Sensor fusion is done with an attention network.
☆69 · Nov 19, 2022 · Updated 3 years ago
ahmadikalkhorani / AVCrossNet
☆16 · Jul 4, 2024 · Updated 2 years ago
ms-dot-k / Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV 2021)
☆22 · Apr 11, 2022 · Updated 4 years ago
mpc001 / Lipreading_using_Temporal_Convolutional_Networks
ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…
☆437 · May 18, 2023 · Updated 3 years ago
ahaliassos / raven
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
☆82 · Feb 27, 2025 · Updated last year
3loi / MSP_Face
☆13 · Nov 15, 2024 · Updated last year
ms-dot-k / Multi-head-Visual-Audio-Memory
PyTorch implementation of "Distinguishing Homophenes using Multi-Head Visual-Audio Memory" (AAAI 2022)
☆27 · Mar 9, 2024 · Updated 2 years ago
arxrean / LipRead-seq2seq
An unofficial PyTorch implementation of the paper "Deep Lip Reading: A comparison of models and an online application".
☆10 · May 13, 2020 · Updated 6 years ago
prajwalkr / transpotter
Official implementation of Transpotter, published in BMVC 2021
☆16 · Aug 6, 2022 · Updated 3 years ago
mpc001 / end-to-end-lipreading
PyTorch code for End-to-End Audiovisual Speech Recognition
☆183 · Nov 18, 2022 · Updated 3 years ago
georgesterpu / avsr-tf1
Audio-Visual Speech Recognition using Sequence to Sequence Models
☆84 · Jul 10, 2020 · Updated 6 years ago
joannahong / AV-RelScore
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆35 · Jun 20, 2023 · Updated 3 years ago
Bose / RAVEN
☆20 · Oct 6, 2025 · Updated 9 months ago
joannahong / Lip2Wav-pytorch
A PyTorch implementation of Lip2Wav
☆50 · Oct 2, 2022 · Updated 3 years ago
NirHeaven / D3D
The method proposed in "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild"
☆26 · Nov 23, 2018 · Updated 7 years ago
smeetrs / deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
☆244 · Feb 15, 2024 · Updated 2 years ago
a-nagrani / SVHF-Net
SVHF-Net for Cross-modal binary matching
☆32 · Aug 22, 2018 · Updated 7 years ago
ajinkyaT / Lip_Reading_in_the_Wild_AVSR
Audio-Visual Speech Recognition using Deep Learning
☆61 · Nov 14, 2018 · Updated 7 years ago
choijeongsoo / utut
[TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation
☆31 · Sep 6, 2024 · Updated last year
LeeYongHyeok / DCM_vgg_transformer
Dual cross modality attention audio-visual speech recognition model based on vgg transformer with hybrid CTC/attention architecture using…
☆14 · Jul 2, 2020 · Updated 6 years ago
gevangelopoulos / timit-lstm
Long Short-Term Memory Neural Networks trained and tested on the TIMIT Acoustic-Phonetic Continuous Speech Corpus.
☆11 · Aug 27, 2017 · Updated 8 years ago
Exgc / AVMuST-TED
☆24 · Mar 30, 2024 · Updated 2 years ago
georgesterpu / pyVSR
Python toolkit for Visual Speech Recognition
☆37 · Jun 10, 2020 · Updated 6 years ago
afourast / deep_lip_reading
Code and models for evaluating a state-of-the-art lip reading network
☆196 · Mar 24, 2023 · Updated 3 years ago
mpc001 / Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages
☆478 · Aug 17, 2023 · Updated 2 years ago
vskadandale / vocalist
Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
☆73 · Apr 7, 2024 · Updated 2 years ago
lzuwei / end-to-end-multiview-lipreading
End to End Multiview Lip Reading
☆10 · Jan 26, 2018 · Updated 8 years ago
LUMIA-Group / Leveraging-Self-Supervised-Learning-for-AVSR
Official PyTorch implementation of the paper Leveraging Unimodal Self Supervised Learning for Multimodal Audio-Visual Speech Recognition (ACL…
☆67 · Jul 13, 2022 · Updated 4 years ago
Rudrabha / Lip2Wav
This is the repository containing code for our CVPR 2020 paper titled "Learning Individual Speaking Styles for Accurate Lip to Speech S…
☆713 · Jul 6, 2023 · Updated 3 years ago
jyhan03 / dpccn
This repository provides an implementation of the DPCCN model for single-channel speech separation. More details will be updated soon.
☆13 · Dec 8, 2021 · Updated 4 years ago
Imbecillus / kardinal-o-mat
☆24 · May 11, 2025 · Updated last year
sungnyun / avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
☆13 · Oct 22, 2024 · Updated last year
kaistmm / FlowAVSE
☆27 · Jul 15, 2024 · Updated 2 years ago
facebookresearch / av_hubert
A self-supervised learning framework for audio-visual speech
☆993 · Dec 7, 2023 · Updated 2 years ago
TimeChi / Lip_Reading_Competition
Code sharing for the AI algorithm track of the 2019 "Chuang Qing Chun · Jiaozi Cup" XW Bank University FinTech Challenge
☆89 · Jul 15, 2020 · Updated 6 years ago