matthijsvk / multimodalSR

Multimodal speech recognition using lipreading (with CNNs) and audio (using LSTMs). Sensor fusion is done with an attention network.

☆69

Alternatives and similar repositories for multimodalSR

Users who are interested in multimodalSR are comparing it to the repositories listed below.

albanie / mcnCrossModalEmotions
Supporting code for "Emotion Recognition in Speech using Cross-Modal Transfer in the Wild"
☆103 Updated 5 years ago
KrishnaDN / speech-emotion-recognition-using-self-attention
Implementation of the paper "Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning" From I…
☆57 Updated 4 years ago
30stomercury / Interaction-Aware-Attention-Network
[ICASSP19] An Interaction-aware Attention Network for Speech Emotion Recognition in Spoken Dialogs
☆35 Updated 5 years ago
georgesterpu / avsr-tf1
Audio-Visual Speech Recognition using Sequence to Sequence Models
☆82 Updated 5 years ago
anuragkr90 / weak_feature_extractor
☆59 Updated 7 years ago
trexwithoutt / Speech-Emotion-Recognition-utterancelevel-DNN
Work inspired by the SER using ELM project at Microsoft Research
☆19 Updated 7 years ago
gogyzzz / localatt_emorecog
A Pytorch implementation of 'AUTOMATIC SPEECH EMOTION RECOGNITION USING RECURRENT NEURAL NETWORKS WITH LOCAL ATTENTION'
☆41 Updated 7 years ago
ajinkyaT / Lip_Reading_in_the_Wild_AVSR
Audio-Visual Speech Recognition using Deep Learning
☆60 Updated 6 years ago
bill9800 / speech_separation
Includes some core functions and a model to handle speech separation
☆155 Updated 4 years ago
ankitshah009 / WALNet-Weak_Label_Analysis
Repository for Weak Label Learning for Audio Events - A closer look. Uses Audioset subset data provided for reproducibility.
☆32 Updated last year
end2you / end2you
☆110 Updated 2 years ago
AudioVisualEmotionChallenge / AVEC2018
Baseline scripts of the 8th Audio/Visual Emotion Challenge (AVEC 2018)
☆60 Updated 7 years ago
mravanelli / pytorch_MLP_for_ASR
This code implements a basic MLP for speech recognition. The MLP is trained with pytorch, while feature extraction, alignments, and dec…
☆38 Updated 7 years ago
bagustris / SER_ICSigSys2019
Repository of code for Speech emotion recognition using voiced speech and attention model, submitted to ICSigSys 2019
☆13 Updated 5 years ago
MaigoAkisame / cmu-thesis
Code for Yun Wang's PhD Thesis: Polyphonic Sound Event Detection with Weak Labeling
☆168 Updated 3 years ago
karolpiczak / paper-2017-DCASE
The details that matter: Frequency resolution of spectrograms in acoustic scene classification - paper replication data
☆39 Updated 7 years ago
aishoot / Speech_Feature_Extraction
Feature extraction of speech signal is the initial stage of any speech recognition system.
☆93 Updated 4 years ago
astorfi / 3D-convolutional-speaker-recognition-pytorch
Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
☆122 Updated 6 years ago
khaotik / DaNet-Tensorflow
Tensorflow implementation of "Speaker-independent Speech Separation with Deep Attractor Network"
☆90 Updated 4 years ago
rajathkmp / speaker-verification
Implementation of the state-of-the-art d-vector approach for speaker verification
☆127 Updated 7 years ago
vladimir-chernykh / emotion_recognition
CTC for emotion recognition
☆61 Updated 8 years ago
dr-pato / audio_visual_speech_enhancement
Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
☆109 Updated last year
swshon / voxceleb-ivector
Voxceleb1 i-vector based speaker recognition system
☆43 Updated 7 years ago
joaoantoniocn / AM-SincNet
The Additive Margin SincNet (AM-SincNet) is a new approach for speaker recognition problems which is based in the neural network architec…
☆45 Updated last year
DCASE-REPO / dcase2018_baseline
DCASE 2018 Baseline systems
☆129 Updated 5 years ago
georgesterpu / pyVSR
Python toolkit for Visual Speech Recognition
☆37 Updated 5 years ago
matthijsvk / TCDTIMITprocessing
Processing and extraction of face and mouth image files from the TCDTIMIT database
☆45 Updated 4 years ago
zhilangtaosha / SpeakerVerification_AMSoftmax_pytorch
SE-Resnet+AMSoftmax for Speaker Verification
☆47 Updated 6 years ago
DeepLearn-lab / Acoustic-Feature-Fusion_Chime18
Code for our paper "Acoustic Features Fusion using Attentive Multi-channel Deep Architecture" in Keras and TensorFlow
☆26 Updated 6 years ago
edufonseca / icassp19
Public repository for the paper "Learning Sound Event Classifiers from Web Audio with Noisy Labels"
☆98 Updated 6 years ago