EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset
☆59 · Nov 23, 2020 · Updated 5 years ago
Alternatives and similar repositories for EgoCom-Dataset
Users who are interested in EgoCom-Dataset are comparing it to the repositories listed below
- [CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos' ☆13 · Jun 16, 2024 · Updated last year
- ☆21 · Feb 15, 2022 · Updated 4 years ago
- Code accompanying EGO-TOPO: Environment Affordances from Egocentric Video (CVPR 2020) ☆31 · Aug 3, 2022 · Updated 3 years ago
- ☆67 · Sep 13, 2022 · Updated 3 years ago
- ☆16 · Apr 10, 2019 · Updated 6 years ago
- ☆10 · Jul 24, 2019 · Updated 6 years ago
- Companion toolkit of the 'Serial Speakers' dataset. ☆11 · Feb 17, 2020 · Updated 6 years ago
- Implementation of FixMatch in PyTorch and experimentations ☆12 · Aug 9, 2020 · Updated 5 years ago
- Official repo for the STRFNet system that appeared at INTERSPEECH 2020 ☆12 · Mar 6, 2021 · Updated 5 years ago
- Code for https://arxiv.org/abs/1712.00254 ☆16 · Dec 6, 2017 · Updated 8 years ago
- An Android app that listens to conversations and determines who was speaking at any point in the conversation - a task known as speech di… ☆14 · Apr 12, 2021 · Updated 4 years ago
- Inferring Body Pose in Egocentric Video via First and Second Person Interactions ☆51 · Aug 31, 2021 · Updated 4 years ago
- The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente… ☆129 · Dec 4, 2023 · Updated 2 years ago
- ☆20 · Nov 3, 2021 · Updated 4 years ago
- Anonymous ICLR Submission ☆14 · Sep 25, 2019 · Updated 6 years ago
- Example workflow for our data-centric speech benchmark ☆17 · Jul 6, 2023 · Updated 2 years ago
- Wave-U-Net for automatic (drum) mixing ☆38 · Mar 24, 2023 · Updated 2 years ago
- Text-based media editing interface ☆16 · Aug 9, 2017 · Updated 8 years ago
- [ICLR 2019] Learning Factorized Multimodal Representations ☆67 · Aug 4, 2020 · Updated 5 years ago
- DCASE2020 Challenge Task 2 baseline variants ☆21 · Apr 2, 2020 · Updated 5 years ago
- Code repo for "Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection" ☆17 · Nov 9, 2022 · Updated 3 years ago
- Simple baseline model for the HEAR benchmark ☆23 · Feb 17, 2026 · Updated 2 weeks ago
- Experiment in automatic insertion of timed transcript corrections ☆21 · Oct 31, 2017 · Updated 8 years ago
- ☆22 · Jun 30, 2021 · Updated 4 years ago
- Real-time Speech Separation, Noise Suppression & Speaker Recognition ☆18 · Apr 17, 2019 · Updated 6 years ago
- Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Buil… ☆40 · Jun 17, 2025 · Updated 8 months ago
- ☆49 · Nov 24, 2022 · Updated 3 years ago
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020 ☆43 · Jul 17, 2020 · Updated 5 years ago
- A synthetic dataset of dialogs we authored and annotated for references (pronouns, etc.). This dataset is discussed in the paper "MuDoCo:… ☆24 · Mar 24, 2022 · Updated 3 years ago
- Evaluation script for VoxMovies dataset in PyTorch ☆23 · Jan 12, 2024 · Updated 2 years ago
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN) ☆21 · Sep 27, 2017 · Updated 8 years ago
- Repository containing experiments with Extreme Learning Machines And Reservoir Computing, ELMARC. ☆20 · May 1, 2018 · Updated 7 years ago
- [CVPR 2023] Egocentric Audio-Visual Object Localization ☆26 · Jan 6, 2024 · Updated 2 years ago
- Annotations for the public release of the EPIC-KITCHENS-100 dataset ☆166 · Aug 1, 2022 · Updated 3 years ago
- Python scripts to download Assembly101 from Google Drive ☆64 · Oct 10, 2024 · Updated last year
- ☆31 · Feb 24, 2023 · Updated 3 years ago
- Multi-Target Embodied Question Answering ☆26 · Jul 17, 2020 · Updated 5 years ago
- Website for the ISMIR 2023 Tutorial: Few-shot and Zero-shot Learning for MIR ☆30 · Jan 3, 2023 · Updated 3 years ago
- Zero-Resource Speech Discovery, Search, and Evaluation Tools ☆29 · Aug 6, 2015 · Updated 10 years ago