facebookresearch / soundspaces-challenge
Starter code for SoundSpaces challenge at CVPR 21's Embodied AI workshop
☆12Updated last year
Related projects
Alternatives and complementary repositories for soundspaces-challenge
- ☆23Updated 4 years ago
- EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset☆52Updated 3 years ago
- VisualEchoes Dataset (ECCV 2020)☆34Updated 3 years ago
- A JAX implementation of Broaden Your Views for Self-Supervised Video Learning, or BraVe for short.☆48Updated 5 months ago
- Official codes for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"☆26Updated 2 years ago
- Audio propagation engine - Meta Reality Labs Research.☆17Updated 2 years ago
- Code and datasets for 'Move2Hear: Active Audio-Visual Source Separation' (ICCV 2021)☆14Updated last year
- Official PyTorch implementation of "Improving Generative Imagination in Object-Centric World Models"☆34Updated last year
- Evaluation script for VoxMovies dataset in PyTorch☆22Updated 10 months ago
- [ICLR 2021] Beyond Categorical Label Representations for Image Classification☆25Updated 2 years ago
- Repo for Visual Acoustic Matching, CVPR 2022☆65Updated last year
- Code for Look for the Change paper published at CVPR 2022☆35Updated 2 years ago
- Self-supervised algorithm for learning representations from ego-centric video data. Code is tested on EPIC-Kitchens-100 and Ego4D in PyTo…☆11Updated 2 years ago
- CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning☆103Updated 3 years ago
- A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts …☆13Updated 2 years ago
- ☆33Updated 10 months ago
- ☆39Updated 10 months ago
- SAAVN code release for the paper "Sound Adversarial Audio-Visual Navigation", ICLR 2022 (in PyTorch)☆16Updated 2 years ago
- multimodal video-audio-text generation and retrieval between every pair of modalities on the MUGEN dataset. The repo. contains the traini…☆39Updated last year
- RareAct: A video dataset of unusual interactions☆32Updated 4 years ago
- Code for Improved Conditional VRNNs for Video Prediction☆37Updated 3 years ago
- Unofficial Implementation of Google Deepmind's paper `Objects that Sound`☆83Updated 6 years ago
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation☆35Updated 10 months ago
- 2.5D visual sound dataset☆92Updated 3 years ago
- Code for the paper Learning the Predictability of the Future (CVPR 2021)☆161Updated last year
- Project website for "Telling left from right: Learning spatial correspondence between sight and sound"☆20Updated 2 years ago
- Gym wrapper for Vizdoom environments☆12Updated 5 years ago
- Code for the paper "InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via Intermediary Latents".☆18Updated 3 years ago
- This is the PyTorch version of the TCC loss used in the paper 'Temporal Cycle-Consistency Learning'.☆25Updated 4 years ago
- [NeurIPS 2021 Spotlight] Learning to Compose Visual Relations☆101Updated last year