mpc001 / Visual_Speech_Recognition_for_Multiple_LanguagesLinks

Visual Speech Recognition for Multiple Languages

☆422

Alternatives and similar repositories for Visual_Speech_Recognition_for_Multiple_Languages

Users that are interested in Visual_Speech_Recognition_for_Multiple_Languages are comparing it to the libraries listed below

Sorting:

mpc001 / Lipreading_using_Temporal_Convolutional_Networks
ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…
☆419Updated 2 years ago
smeetrs / deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
☆234Updated last year
TaoRuijie / TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
☆399Updated last year
mpc001 / auto_avsr
Auto-AVSR: Lip-Reading Sentences Project
☆362Updated 6 months ago
prajwalkr / vtp
Official Implementation of Visual Transformer Pooling for Lip reading
☆40Updated 2 years ago
facebookresearch / av_hubert
A self-supervised learning framework for audio-visual speech
☆921Updated last year
VIPL-Audio-Visual-Speech-Understanding / learn-an-effective-lip-reading-model-without-pains
The PyTorch Code and Model In "Learn an Effective Lip Reading Model without Pains", (https://arxiv.org/abs/2011.07557), which reaches the…
☆163Updated 2 years ago
facebookresearch / VisualVoice
Audio-Visual Speech Separation with Cross-Modal Consistency
☆232Updated last year
joonson / syncnet_python
Out of time: automated lip sync in the wild
☆795Updated last year
Chris10M / Lip2Speech
A pipeline to read lips and generate speech for the read content, i.e Lip to Speech Synthesis.
☆86Updated 3 years ago
Junhua-Liao / Light-ASD
The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)
☆147Updated 3 months ago
CheyneyComputerScience / CREMA-D
Crowd Sourced Emotional Multimodal Actors Dataset (CREMA-D)
☆448Updated 3 months ago
ControlNet / MARLIN
[CVPR] MARLIN: Masked Autoencoder for facial video Representation LearnINg
☆254Updated 3 months ago
ddlBoJack / emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training fo…
☆901Updated 6 months ago
afourast / deep_lip_reading
Code and models for evaluating a state-of-the-art lip reading network
☆195Updated 2 years ago
KAIST-AILab / SyncVSR
[Interspeech 2024] SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization
☆56Updated 3 months ago
burchim / AVEC
[WACV 2023] Audio-Visual Efficient Conformer (AVEC) for Robust Speech Recognition
☆97Updated 2 years ago
JeffC0628 / awesome-voice-conversion
A curated list of awesome voice conversion, projects and communities.
☆239Updated 6 months ago
VIPL-Audio-Visual-Speech-Understanding / LipNet-PyTorch
The state-of-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxi…
☆225Updated 2 years ago
joonson / syncnet_trainer
Disentangled Speech Embeddings using Cross-Modal Self-Supervision
☆160Updated 5 years ago
uniBruce / Mead
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020]
☆266Updated last year
ASR-project / Multilingual-PR
Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen…
☆235Updated 3 years ago
cristinalunaj / MMEmotionRecognition
Repository with the code of the paper: A proposal for Multimodal Emotion Recognition using auraltransformers and Action Units on RAVDESS …
☆106Updated last year
HLTSingapore / Emotional-Speech-Data
This is the GitHub page for publicly available emotional speech data.
☆357Updated 3 years ago
walkoncross / voxceleb2-download-zyf
Tools for downloading VoxCeleb2 dataset
☆30Updated last year
joannahong / Lip2Wav-pytorch
a PyTorch implementation of Lip2Wav
☆51Updated 2 years ago
abikaki / awesome-speech-emotion-recognition
😎 Awesome lists about Speech Emotion Recognition
☆93Updated 6 months ago
hhj1897 / face_detection
☆46Updated last year
Jiaxin-Ye / TIM-Net_SER
[ICASSP 2023] Official Tensorflow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech E…
☆175Updated last year
huawei-noah / Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
☆590Updated last year