SuchismitaSahu1993/Lipreading-Using-Mutimodal-Speech-Recognition

SuchismitaSahu1993 / Lipreading-Using-Mutimodal-Speech-Recognition

Multimodal speech recognition for phoneme-level prediction using audio-visual data from the TCDTIMIT dataset, implementing RNNs with LSTMs for the audio subnetwork and CNN-LSTMs for the video subnetwork.

☆15

Alternatives and similar repositories for Lipreading-Using-Mutimodal-Speech-Recognition

Users who are interested in Lipreading-Using-Mutimodal-Speech-Recognition are comparing it to the libraries listed below.

haibalabs / face-mesh-to-blendshapes
☆16 · Aug 8, 2023 · Updated 2 years ago
matthijsvk / TCDTIMITprocessing
Processing and extraction of face and mouth image files from the TCDTIMIT database
☆46 · Sep 22, 2020 · Updated 5 years ago
LeeYongHyeok / DCM_vgg_transformer
Dual cross-modality attention audio-visual speech recognition model based on a VGG transformer with hybrid CTC/attention architecture using…
☆14 · Jul 2, 2020 · Updated 5 years ago
lzuwei / end-to-end-multiview-lipreading
End-to-end multiview lip reading
☆10 · Jan 26, 2018 · Updated 8 years ago
georgesterpu / pyVSR
Python toolkit for Visual Speech Recognition
☆38 · Jun 10, 2020 · Updated 5 years ago
xtliu97 / audio2face-pytorch
PyTorch reimplementation of audio-driven face mesh or blendshape models, including Audio2Mesh, VOCA, etc.
☆17 · Sep 6, 2024 · Updated last year
jarret / raspi-uart-waveshare
A library for interfacing with the 4.3-inch UART e-Paper from a Raspberry Pi 2/3 via Python3 with example programs to display QR Codes for…
☆12 · Mar 9, 2019 · Updated 7 years ago
TarekVito / ColorCoherenceVector
Color Coherence Vector, a powerful color-based image retrieval method (Matlab)
☆11 · Feb 27, 2015 · Updated 11 years ago
lshiwjx / deformable-3d-convnets
Deformable 3D ConvNets for Action Recognition
☆10 · Jan 21, 2018 · Updated 8 years ago
artem179 / WLAS
The implementation of the 'Watch, Listen, Attend and Spell' (WLAS) network that learns to transcribe videos of mouth motion to character on p…
☆11 · Mar 23, 2018 · Updated 8 years ago
wuyinwuxian / Neural_Network_optimization_method
Matlab code comparing five common neural network optimization algorithms: SGD, SGDM, Adagrad, AdaDelta, and Adam
☆11 · Mar 23, 2022 · Updated 4 years ago
ITKaven / RoBMRC
☆10 · Mar 24, 2023 · Updated 3 years ago
qinzzz / Multimodal-Alignment-Framework
Implementation for MAF: Multimodal Alignment Framework
☆46 · Nov 25, 2020 · Updated 5 years ago
Sha-Lab / CMHSE
The code repository for "Cross-Modal and Hierarchical Modeling of Video and Text" in PyTorch
☆16 · Apr 22, 2019 · Updated 6 years ago
usc-sail / child-adult-diarization
Public child-adult speaker diarization/classification model and code
☆18 · Apr 24, 2025 · Updated 11 months ago
dingxiaowei / WwiseStudy
More high-quality tutorials
☆16 · Sep 16, 2019 · Updated 6 years ago
YoungSeng / Speech-driven-expressions
Speech-Driven Expression Blendshape Based on Single-Layer Self-attention Network (AIWIN 2022)
☆78 · Oct 21, 2022 · Updated 3 years ago
tsiangleo / TensorFlowMnist
☆15 · Apr 27, 2017 · Updated 8 years ago
WisleyWang / DC-AI-LipReading
☆11 · May 31, 2020 · Updated 5 years ago
corticph / MSTmodel
Code for https://arxiv.org/abs/1712.00254
☆16 · Dec 6, 2017 · Updated 8 years ago
Lenvia / RBM-BP-character-recognition
Recognition of handwritten digits and English characters with an RBM + BP neural network
☆11 · Mar 25, 2023 · Updated 3 years ago
fushengwuyu / R-Drop
A torch version of RDrop
☆16 · Jul 15, 2021 · Updated 4 years ago
viduzz84 / SubbandAdaptiveX
Subband Adaptive System with Crossterms for aliasing reduction
☆17 · Jul 31, 2022 · Updated 3 years ago
lightning830 / E2E-audio-speech-recognition
Conformer encoder + Transformer decoder with Hybrid CTC/attention
☆12 · Nov 11, 2021 · Updated 4 years ago
multitel-ai / urban-sound-tagging
1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context
☆16 · Dec 8, 2022 · Updated 3 years ago
sooftware / speech-transformer
Transformer implementation specialized in speech recognition tasks using PyTorch.
☆65 · Nov 28, 2021 · Updated 4 years ago
mediatechnologycenter / AvatarForge
Code for the project: "Audio-Driven Video-Synthesis of Personalised Moderations"
☆21 · Jan 31, 2024 · Updated 2 years ago
yuweiwan / ASR-HMM-DNN
Speech recognition based on a deep neural network/hidden Markov model
☆10 · Jun 3, 2020 · Updated 5 years ago
luanshiyinyang / MLP
A backpropagation neural network written from scratch in NumPy, comparing the effects of training techniques such as Dropout and Batch Normalization.
☆11 · Dec 19, 2019 · Updated 6 years ago
Levent9 / Zero-shot-FaceVC
☆19 · Mar 2, 2024 · Updated 2 years ago
webYFDT / hateful
☆11 · May 18, 2022 · Updated 3 years ago
madhurchhajed / Facial-Emotion-Recognition
Deployed facial emotion recognition using a neural network model that predicts the emotion from faces in images, videos and live feed fr…
☆11 · May 2, 2021 · Updated 4 years ago
gudgud96 / noisy-student-emotion-training
Submission to MediaEval 2021 Emotions and Themes in Music challenge. Noisy-student training for music emotion tagging
☆11 · Dec 2, 2021 · Updated 4 years ago
xmy0916 / pytorch_crnn
CRNN text recognition written in PyTorch, with simplified code to help beginners get started
☆13 · Feb 21, 2021 · Updated 5 years ago
carl03q / AudioClassifier
A CNN audio classifier via spectrogram images.
☆10 · Jul 21, 2017 · Updated 8 years ago
bahayonghang / drawio-skills
drawio agent skill
☆69 · Mar 18, 2026 · Updated last week
Boyu1997 / mcts-travel-salesman
Monte Carlo tree search (MCTS) on traveling salesman problem (TSP)
☆22 · Apr 27, 2019 · Updated 6 years ago
wjrzm / MotionPRO
☆18 · Oct 24, 2025 · Updated 5 months ago
tobiastoft91 / VGGish_AudioClassifer_02456
Acoustic Scene Classification using transfer learning on VGGish pre-trained model
☆11 · Jan 3, 2018 · Updated 8 years ago