andrewowens / multisensory

Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features

☆220

Alternatives and similar repositories for multisensory

Users who are interested in multisensory are comparing it to the repositories listed below.

rohitrango / objects-that-sound
Unofficial Implementation of Google Deepmind's paper `Objects that Sound`
☆83 Updated 7 years ago
tstafylakis / Lipreading-ResNet
Torch code for using Residual Networks with LSTMs for Lipreading
☆99 Updated 7 years ago
facebookresearch / FAIR-Play
2.5D visual sound dataset
☆101 Updated 4 years ago
rhgao / Deep-MIML-Network
Learning to Separate Object Sounds by Watching Unlabeled Video (ECCV 2018)
☆51 Updated 6 years ago
rhgao / co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
☆97 Updated 2 years ago
dharwath / DAVEnet-pytorch
Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch
☆65 Updated 7 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆113 Updated 4 years ago
eborboihuc / SoundNet-tensorflow
TensorFlow implementation of "SoundNet".
☆145 Updated 7 years ago
hangzhaomit / Sound-of-Pixels
Codebase for ECCV18 "The Sound of Pixels"
☆386 Updated 3 years ago
ajinkyaT / Lip_Reading_in_the_Wild_AVSR
Audio-Visual Speech Recognition using Deep Learning
☆61 Updated 6 years ago
facebookresearch / 2.5D-Visual-Sound
2.5D visual sound
☆116 Updated 2 years ago
facebookresearch / Listen-to-Look
Listen to Look: Action Recognition by Previewing Audio (CVPR 2020)
☆129 Updated 4 years ago
voletiv / lipreading-in-the-wild-experiments
My experiments in lip reading using deep learning with the LRW dataset
☆52 Updated 4 years ago
ardasnck / learning_to_localize_sound_source
Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes
☆93 Updated 10 months ago
roudimit / MUSIC_dataset
MUSIC Dataset from The Sound of Pixels (ECCV '18)
☆130 Updated 3 years ago
YapengTian / AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
☆193 Updated 4 years ago
a-nagrani / SVHF-Net
SVHF-Net for Cross-modal binary matching
☆32 Updated 7 years ago
facebookresearch / AVID-CMA
Audio Visual Instance Discrimination with Cross-Modal Agreement
☆130 Updated 4 years ago
qiuqiangkong / audioset_classification
☆227 Updated 5 years ago
mpc001 / end-to-end-lipreading
Pytorch code for End-to-End Audiovisual Speech Recognition
☆182 Updated 2 years ago
hche11 / VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
☆336 Updated 4 years ago
changil / avspeech-downloader
AVSpeech downloader
☆69 Updated 6 years ago
euancrabtree / Lipreading-PyTorch
Lip Reading in the Wild using ResNet and LSTMs in PyTorch
☆58 Updated 7 years ago
SheldonTsui / SepStereo_ECCV2020
Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020)
☆72 Updated 5 years ago
hrbigelow / ae-wavenet
Wavenet Autoencoder for Unsupervised speech representation learning (after Chorowski, Jan 2019)
☆176 Updated 5 years ago
matthijsvk / TCDTIMITprocessing
processing and extracting of face and mouth image files out of the TCDTIMIT database
☆46 Updated 5 years ago
albanie / mcnCrossModalEmotions
Supporting code for "Emotion Recognition in Speech using Cross-Modal Transfer in the Wild"
☆105 Updated 6 years ago
Hangz-nju-cuhk / Vision-Infused-Audio-Inpainter-VIAI
Code for Vision-Infused Deep Audio Inpainting (ICCV 2019)
☆57 Updated 6 years ago
georgesterpu / pyVSR
Python toolkit for Visual Speech Recognition
☆38 Updated 5 years ago
zfang399 / AlignNet
AlignNet: A Unifying Approach to Audio-Visual Alignment (WACV 2020)
☆33 Updated 4 years ago