Multimodal speech recognition using lipreading (with CNNs) and audio (using LSTMs). Sensor fusion is done with an attention network.
☆69 · Nov 19, 2022 · Updated 3 years ago
Alternatives and similar repositories for multimodalSR
Users that are interested in multimodalSR are comparing it to the libraries listed below.
- A fine multimodality fusion network :) ☆10 · Aug 9, 2021 · Updated 4 years ago
- Google Summer of Code 2017 Project: Development of Speech Recognition Module for Red Hen Lab ☆44 · Aug 29, 2017 · Updated 8 years ago
- Speech recognition on the TIMIT (or any other) dataset ☆44 · Nov 2, 2017 · Updated 8 years ago
- Processing and extraction of face and mouth image files from the TCDTIMIT database ☆47 · Sep 22, 2020 · Updated 5 years ago
- Code for our paper "Acoustic Features Fusion using Attentive Multi-channel Deep Architecture" in Keras and TensorFlow ☆26 · Nov 23, 2018 · Updated 7 years ago
- This repository contains code and data for "On the Multimodal Person Verification Using Audio-Visual-Thermal Data" ☆12 · Apr 27, 2023 · Updated 3 years ago
- PyTorch code for End-to-End Audiovisual Speech Recognition ☆184 · Nov 18, 2022 · Updated 3 years ago
- (semi) Grapheme-to-Phoneme (G2P) - seq2seq model using PyTorch for Korean ☆23 · Dec 17, 2017 · Updated 8 years ago
- Multimodal Emotion Recognition in a video using feature level fusion of audio and visual modalities ☆15 · Jul 5, 2018 · Updated 7 years ago
- Correspondence and autoencoder neural network training for speech using Pylearn2.