GeWu-Lab/CSOL_TPAMI2021

The repo for "Class-aware Sounding Objects Localization", TPAMI 2021.

☆29

Alternatives and similar repositories for CSOL_TPAMI2021

Users who are interested in CSOL_TPAMI2021 are comparing it to the repositories listed below.

GeWu-Lab / awesome-audiovisual-learning
A curated list of audio-visual learning methods and datasets.
☆288 · Dec 3, 2024 · Updated last year
GeWu-Lab / MMCosine_ICASSP23
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆26 · May 18, 2023 · Updated 3 years ago
GeWu-Lab / MUSIC-AVQA
MUSIC-AVQA, CVPR2022 (ORAL)
☆100 · Dec 30, 2022 · Updated 3 years ago
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆61 · Jan 19, 2022 · Updated 4 years ago
DTaoo / Multimodal-Aerial-Scene-Recognition
Code for "Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition" (ECCV 2020)
☆35 · Oct 13, 2020 · Updated 5 years ago
GeWu-Lab / MWAFM
Multi-Scale Attention for Audio Question Answering
☆28 · Jul 19, 2023 · Updated 3 years ago
roudimit / MUSIC_dataset
MUSIC Dataset from The Sound of Pixels (ECCV '18)
☆137 · Aug 12, 2022 · Updated 3 years ago
fanyix / SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
☆14 · Aug 11, 2020 · Updated 5 years ago
alvinliu0 / Visual-Sound-Localization-in-the-Wild
Code for Visual Sound Localization in the Wild by Cross-Modal Interference Erasing (AAAI 2022).
☆29 · Feb 15, 2022 · Updated 4 years ago
YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆90 · Jul 25, 2024 · Updated last year
facebookresearch / GDT
We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances…
☆47 · Aug 29, 2021 · Updated 4 years ago
ExplainableML / TCAF-GZSL
This repository contains the code for our ECCV 2022 paper "Temporal and cross-modal attention for audio-visual zero-shot learning"
☆25 · Sep 12, 2025 · Updated 10 months ago
MCG-NJU / JoMoLD
[ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
☆27 · Jul 15, 2022 · Updated 4 years ago
jczhang02 / MUSIC_dataset_script
This repo contains a script to download the MUSIC dataset from YouTube
☆12 · Jan 19, 2024 · Updated 2 years ago
DTaoo / DMC
Code for Deep Multimodal Clustering for Unsupervised Audiovisual Learning (CVPR2019)
☆15 · May 27, 2020 · Updated 6 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆114 · Nov 16, 2020 · Updated 5 years ago
mispchallenge / misp2021_baseline
☆29 · Jun 15, 2022 · Updated 4 years ago
hche11 / VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
☆359 · Sep 13, 2021 · Updated 4 years ago
mispchallenge / MISP-ICME-AVSR
☆17 · Jan 1, 2024 · Updated 2 years ago
iskwak / DetetctingActionStarts
Detecting the starting frame of actions in video.
☆10 · Feb 12, 2020 · Updated 6 years ago
ashishgupta023 / Real-Time-Eye-Tracking-Interface-and-Gesture-Recognition
The project aims at detecting the face, eyes, and eyeballs, then tracking the eyeballs to define actions for various gestures. Apart from…
☆12 · Dec 30, 2014 · Updated 11 years ago
GeWu-Lab / Valuate-and-Enhance-Multimodal-Cooperation
The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024
☆62 · Nov 5, 2024 · Updated last year
Wanderlust717 / CARGNet
[TGRS 2023] Point Label Meets Remote Sensing Change Detection: A Consistency-Aligned Regional Growth Network
☆15 · Jan 5, 2024 · Updated 2 years ago
HuangZiliAndy / RPNSD
PyTorch implementation of RPNSD
☆60 · Jun 17, 2024 · Updated 2 years ago
yh1008 / deepLearning
MNIST, SVHN, Transfer Learning, DNN, CNN
☆12 · May 1, 2017 · Updated 9 years ago
svyas23 / cross-view-action
Multi-view action recognition using cross-view video synthesis
☆13 · Jul 25, 2022 · Updated 3 years ago
krantiparida / awesome-audio-visual
A curated list of different papers and datasets in various areas of audio-visual processing
☆775 · Jan 30, 2024 · Updated 2 years ago
GeWu-Lab / TSPM
Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024.
☆17 · Oct 25, 2024 · Updated last year
tqbl / arca23k-dataset
The code used to create the ARCA23K and ARCA23K-FSD datasets
☆16 · Nov 9, 2021 · Updated 4 years ago
aispeech-lab / TinyWASE
PyTorch implementation of TinyWASE described in our paper "Compressing Speaker Extraction Model with Ultra-low Precision Quantization and…
☆11 · Jun 28, 2021 · Updated 5 years ago
speedyseal / audiosetdl
Scripts for downloading AudioSet
☆89 · Nov 7, 2017 · Updated 8 years ago
chi-kaichen / Trinity-Net
K. Chi, Y. Yuan, and Q. Wang*, “Trinity-Net: Gradient-Guided Swin Transformer-Based Remote Sensing Image Dehazing and Beyond,” IEEE Trans…
☆11 · Jan 31, 2023 · Updated 3 years ago
mispchallenge / misp2022_baseline
☆33 · Jun 26, 2023 · Updated 3 years ago
GeWu-Lab / OGM-GE_CVPR2022
The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL)
☆320 · Sep 22, 2025 · Updated 10 months ago
visipedia / ssw60
Sapsucker Woods 60 Audiovisual Dataset
☆19 · Oct 7, 2022 · Updated 3 years ago
GeWu-Lab / MS-Bot
The official repo for "Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation", CoRL 2024 (ORAL)
☆22 · Jun 25, 2025 · Updated last year
AwalkZY / CPN
Code for CVPR2021 Paper “Cascaded Prediction Network via Segment Tree for Temporal Video Grounding”
☆10 · Apr 3, 2022 · Updated 4 years ago
zjlww / papers
Connected Papers knockoff, managing academic papers and citations with a graph database.
☆12 · Dec 26, 2023 · Updated 2 years ago
facebookresearch / AVID-CMA
Audio Visual Instance Discrimination with Cross-Modal Agreement
☆133 · Aug 13, 2021 · Updated 4 years ago