rohitrango / objects-that-soundLinks

Unofficial Implementation of Google Deepmind's paper `Objects that Sound`

☆83

Alternatives and similar repositories for objects-that-sound

Users that are interested in objects-that-sound are comparing it to the libraries listed below

Sorting:

dharwath / DAVEnet-pytorch
Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch
☆65Updated 7 years ago
rhgao / Deep-MIML-Network
Learning to Separate Object Sounds by Watching Unlabeled Video (ECCV 2018)
☆51Updated 6 years ago
andrewowens / multisensory
Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
☆221Updated 6 years ago
facebookresearch / AVID-CMA
Audio Visual Instance Discrimination with Cross-Modal Agreement
☆130Updated 4 years ago
facebookresearch / FAIR-Play
2.5D visual sound dataset
☆101Updated 4 years ago
facebookresearch / Listen-to-Look
Listen to Look: Action Recognition by Previewing Audio (CVPR 2020)
☆129Updated 4 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆113Updated 5 years ago
roudimit / MUSIC_dataset
MUSIC Dataset from The Sound of Pixels (ECCV '18)
☆131Updated 3 years ago
YapengTian / AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
☆193Updated 4 years ago
rhgao / co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
☆97Updated 2 years ago
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆59Updated 3 years ago
eborboihuc / SoundNet-tensorflow
TensorFlow implementation of "SoundNet".
☆145Updated 7 years ago
tstafylakis / Lipreading-ResNet
Torch code for using Residual Networks with LSTMs for Lipreading
☆99Updated 7 years ago
HumamAlwassel / XDC
Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)
☆91Updated 3 years ago
ardasnck / learning_to_localize_sound_source
Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes
☆94Updated 11 months ago
Kajiyu / LLLNet
Keras Implementation of "Look, Listen and Learn" Model
☆21Updated 8 years ago
facebookresearch / selavi
This repo covers the implementation for Labelling unlabelled videos from scratch with multi-modal self-supervision, which learns clusters…
☆117Updated 4 years ago
shayangharib / AUDASC
Adversarial Unsupervised Domain Adaptation for Acoustic Scene Classification
☆37Updated 7 years ago
csehong / VM-NET
Content-Based Video-Music Retrieval using Soft Intra-Modal Structure Constraint
☆62Updated 8 years ago
facebookresearch / 2.5D-Visual-Sound
2.5D visual sound
☆116Updated 2 years ago
a-nagrani / SVHF-Net
SVHF-Net for Cross-modal binary matching
☆32Updated 7 years ago
marl / l3embedding
Learn and L3 embedding from audio/video pairs
☆88Updated 3 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆82Updated 3 years ago
hche11 / VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
☆341Updated 4 years ago
jingliao132 / CrossModalRetrieval
Pytorch implementation of 'See, Hear, and Read: Deep Aligned Representations'
☆33Updated 6 years ago
bobchennan / sparse_image_warp_pytorch
Pytorch implementation of sparse_image_warp and an example of GoogleBrain's SpecAugment is given: A Simple Data Augmentation Method for A…
☆24Updated 6 years ago
qiuqiangkong / audioset_classification
☆228Updated 5 years ago
facebookresearch / EgoCom-Dataset
EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset
☆58Updated 4 years ago
hrbigelow / ae-wavenet
Wavenet Autoencoder for Unsupervised speech representation learning (after Chorowski, Jan 2019)
☆176Updated 5 years ago
Hangz-nju-cuhk / Vision-Infused-Audio-Inpainter-VIAI
Code for Vision-Infused Deep Audio Inpainting (ICCV 2019)
☆58Updated 6 years ago