epic-kitchens/epic-sounds-annotations

Splits for epic-sounds dataset

☆85

Alternatives and similar repositories for epic-sounds-annotations

Users interested in epic-sounds-annotations are comparing it to the repositories listed below.

epic-kitchens / epic-kitchens-100-object-masks
Support library for the MaskRCNN masks extracted on EPIC-KITCHENS-100
☆14 · Dec 1, 2020 · Updated 5 years ago
JacobChalk / TIM
Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition"
☆54 · Nov 7, 2024 · Updated last year
usc-sail / mica-subtitle-aligned-movie-sounds
A dataset for Audio-Visual Sound Event Detection in Movies
☆26 · Jan 23, 2023 · Updated 3 years ago
ttgeng233 / UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
☆73 · Jan 4, 2026 · Updated 6 months ago
ekazakos / auditory-slow-fast
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch
☆73 · Sep 27, 2021 · Updated 4 years ago
epic-kitchens / epic-kitchens-100-narrator
Video narrator written in Python/GTK using vlc-lib
☆25 · Jun 22, 2022 · Updated 4 years ago
OpenGVLab / perception_test_iccv2023
Champion Solutions repository for Perception Test challenges in ICCV2023 workshop.
☆14 · Oct 18, 2023 · Updated 2 years ago
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Jul 16, 2026 · Updated last week
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
EGO4D / audio-visual
☆69 · Sep 13, 2022 · Updated 3 years ago
epic-kitchens / epic-kitchens-100-hand-object-bboxes
A repo for processing the raw hand object detections to produce releasable pickles + library for using these
☆40 · Oct 26, 2024 · Updated last year
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
akoepke / audio-retrieval-benchmark
Code for "Audio Retrieval with Natural Language Queries: A Benchmark Study", Transactions on Multimedia 2022
☆54 · Jul 16, 2025 · Updated last year
Franklin905 / VALOR
Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser"
☆17 · Jul 13, 2025 · Updated last year
GeWu-Lab / awesome-audiovisual-learning
A curated list of audio-visual learning methods and datasets.
☆289 · Dec 3, 2024 · Updated last year
epic-kitchens / epic-kitchens-100-annotations
Annotations for the public release of the EPIC-KITCHENS-100 dataset
☆173 · Aug 1, 2022 · Updated 3 years ago
YuanGongND / uavm
Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".
☆57 · Apr 20, 2023 · Updated 3 years ago
JaesungHuh / VoxMovies
Evaluation script for VoxMovies dataset in PyTorch
☆23 · Jan 12, 2024 · Updated 2 years ago
facebookresearch / daqa
Temporal Reasoning via Audio Question Answering
☆27 · Dec 21, 2019 · Updated 6 years ago
cdjkim / audiocaps
🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps
☆215 · Oct 6, 2025 · Updated 9 months ago
ftshijt / Interspeech2024_DiscreteSpeechChallenge
This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.
☆32 · Jan 26, 2024 · Updated 2 years ago
v-iashin / SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
☆372 · Jul 12, 2024 · Updated 2 years ago
ttgeng233 / UniAV
Unified Audio-Visual Perception for Multi-Task Video Localization
☆33 · Apr 19, 2024 · Updated 2 years ago
TdP-2025 / TdP-2025
☆12 · Jul 22, 2025 · Updated last year
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 · Updated this week
mira-ai-lab / MUSIC-AVQA-R
☆13 · May 21, 2024 · Updated 2 years ago
SuperKogito / pydiogment
Python library for audio augmentation
☆84 · Jul 6, 2023 · Updated 3 years ago
stoneMo / AVGN
Official implementation for AVGN
☆42 · Mar 24, 2023 · Updated 3 years ago
liuhuadai / ViT-TTS
PyTorch Implementation of ViT-TTS (EMNLP'23)
☆11 · Oct 20, 2023 · Updated 2 years ago
hazeld / action-modifiers
Code for the CVPR 2020 paper 'Action Modifiers: Learning from Adverbs in Instructional Videos'
☆23 · May 17, 2021 · Updated 5 years ago
epic-kitchens / VISOR-VIS
Visualisation of VISOR Segmentations with Annotations and Relations
☆22 · Aug 15, 2022 · Updated 3 years ago
yangdongchao / Text-to-sound-Synthesis
The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation"
☆366 · Aug 3, 2023 · Updated 2 years ago
GeWu-Lab / MMCosine_ICASSP23
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆26 · May 18, 2023 · Updated 3 years ago
XYPB / CondFoleyGen
Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".
☆93 · Dec 8, 2023 · Updated 2 years ago
SAGNIKMJR / ego-AV-spatial-correspondence
[CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'
☆14 · Jun 16, 2024 · Updated 2 years ago
GenjiB / LAVISH
Vision Transformers are Parameter-Efficient Audio-Visual Learners
☆107 · Aug 11, 2023 · Updated 2 years ago
RickyL-2000 / AlignSTS
Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment
☆68 · Jul 5, 2024 · Updated 2 years ago
bmcfee / ccrma2018_notebooks
Extra notebooks for CCRMA MIR workshop, 2018 edition
☆13 · Jun 28, 2018 · Updated 8 years ago
Chiaraplizz / ARGO1M-What-can-a-cook
☆11 · Jul 14, 2023 · Updated 3 years ago