usc-sail/mica-subtitle-aligned-movie-sounds

A dataset for Audio-Visual Sound Event Detection in Movies

☆26

Alternatives and similar repositories for mica-subtitle-aligned-movie-sounds

Users who are interested in mica-subtitle-aligned-movie-sounds are comparing it to the repositories listed below.

965694547 / Hybrid-system-of-frame-wise-model-and-SEDT
☆28 · Mar 14, 2023 · Updated 3 years ago
epic-kitchens / epic-sounds-annotations
Splits for epic-sounds dataset
☆85 · Aug 2, 2025 · Updated 11 months ago
Robiwan245 / SiamMAE
☆12 · Mar 5, 2024 · Updated 2 years ago
sony / CLIPSep
☆43 · Feb 21, 2023 · Updated 3 years ago
visipedia / ssw60
Sapsucker Woods 60 Audiovisual Dataset
☆19 · Oct 7, 2022 · Updated 3 years ago
vivoutlaw / tcbp
Temporal Compact Bilinear Pooling (TCBP)
☆11 · May 27, 2020 · Updated 6 years ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
GenjiB / LAVISH
Vision Transformers are Parameter-Efficient Audio-Visual Learners
☆107 · Aug 11, 2023 · Updated 2 years ago
YuanGongND / uavm
Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".
☆57 · Apr 20, 2023 · Updated 3 years ago
naba89 / iSeparate-SDX
iSeparate library for the SDX2023 challenge
☆15 · Dec 15, 2023 · Updated 2 years ago
MTG / PodcastMix-inference
☆32 · Jan 6, 2022 · Updated 4 years ago
GeWu-Lab / MMCosine_ICASSP23
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆26 · May 18, 2023 · Updated 3 years ago
JustinYuu / MM_Pyramid
[ACM MM 2022] MM_Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing
☆15 · Aug 26, 2022 · Updated 3 years ago
davidliujiafeng / ccom_mdx2023
☆10 · Jun 6, 2023 · Updated 3 years ago
TencentYoutuResearch / HighlightDetection-CLC
Code for CVPR2023 paper "Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies"
☆18 · Mar 21, 2023 · Updated 3 years ago
archinetai / audio-encoders-pytorch
A collection of audio autoencoders, in PyTorch.
☆44 · Mar 7, 2023 · Updated 3 years ago
usc-sail / mica-speech-activity-detection
Robust Speech Activity Detection (SAD) in movie audio
☆26 · Jan 27, 2021 · Updated 5 years ago
j-bernardi / psds_eval
Polyphonic Sound Detection Score (PSDS)
☆20 · Jan 20, 2020 · Updated 6 years ago
MANLP-suda / HHMPN
Multi-modal Multi-label Emotion Recognition with Heterogeneous Hierarchical Message Passing
☆18 · Sep 24, 2022 · Updated 3 years ago
PardoAlejo / LearningToCut
Official Code of ICCV 2021 Paper: Learning to Cut by Watching Movies
☆51 · Nov 9, 2022 · Updated 3 years ago
jasongief / PSP_CVPR_2021
[2021 CVPR] Positive Sample Propagation along the Audio-Visual Event Line
☆42 · Jul 5, 2022 · Updated 4 years ago
speedyseal / audiosetdl
Scripts for download AudioSet
☆89 · Nov 7, 2017 · Updated 8 years ago
OpenGVLab / LORIS
[ICML2023] Long-Term Rhythmic Video Soundtracker
☆63 · Jul 28, 2025 · Updated last year
m-bain / CondensedMovies-chall
Condensed Movies Challenge 2021
☆22 · Sep 21, 2022 · Updated 3 years ago
RetroCirce / HTS-Audio-Transformer
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
☆504 · Sep 18, 2025 · Updated 10 months ago
apple-yinhan / TQ-SED
☆24 · Mar 19, 2025 · Updated last year
audio-captioning / audio-captioning-resources
A list of resources that can help in research for automated audio captioning
☆34 · Feb 17, 2021 · Updated 5 years ago
abruckert / eye_tracking_filmmaking
☆24 · Feb 25, 2021 · Updated 5 years ago
wsntxxn / AudioCaption
Audio captioning recipe
☆53 · Oct 23, 2025 · Updated 9 months ago
DCASE-REPO / DESED_task
Domestic environment sound event detection task
☆157 · Jun 11, 2024 · Updated 2 years ago
KimberleyJensen / kmdx-net_music-source-separation
☆34 · May 15, 2023 · Updated 3 years ago
geforcefan / libnolimits
A NoLimits Roller Coaster 1 and 2 Library written in C++
☆12 · Feb 16, 2023 · Updated 3 years ago
yangdongchao / Tim-TSENet
The source code of Tim-TSENet
☆15 · Apr 22, 2022 · Updated 4 years ago
iclr2024mcmi / ICLRMCMI
Official implementation of Bayes Conditional Distribution Estimation for Knowledge Distillation Based on Conditional Mutual Information
☆12 · Sep 28, 2023 · Updated 2 years ago
YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆90 · Jul 25, 2024 · Updated 2 years ago
YapengTian / AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
☆210 · Apr 3, 2021 · Updated 5 years ago
jinxiang-liu / anno-free-AVS
Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"
☆38 · Oct 11, 2024 · Updated last year
XinhaoMei / WavCaps
This repository contains metadata of WavCaps dataset and codes for downstream tasks.
☆264 · Jul 25, 2024 · Updated 2 years ago
haiciyang / Remixing
Official repo of ICASSP 2022 paper - Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization
☆20 · Jan 7, 2025 · Updated last year