haoyi-duan/DG-SCT

NeurIPS'2023 official implementation code

☆70

Alternatives and similar repositories for DG-SCT

Users that are interested in DG-SCT are comparing it to the libraries listed below.

marmot-xy / CMBS
cross modal background suppression for audio-visual event localization
☆36 · Mar 18, 2022 · Updated 4 years ago
GenjiB / LAVISH
Vision Transformers are Parameter-Efficient Audio-Visual Learners
☆106 · Aug 11, 2023 · Updated 2 years ago
jasongief / CPSP
[2022 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line
☆32 · Mar 6, 2023 · Updated 3 years ago
aspirinone / CATR.github.io
☆31 · Mar 1, 2024 · Updated 2 years ago
ubc-vision / TriBERT
Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation" in NeurIPS…
☆14 · Dec 9, 2021 · Updated 4 years ago
GeWu-Lab / MMCosine_ICASSP23
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆26 · May 18, 2023 · Updated 3 years ago
ttgeng233 / UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
☆73 · Jan 4, 2026 · Updated 6 months ago
zhangguanghao523 / CMMCoT
[AAAI'26] Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augm…
☆11 · Dec 5, 2025 · Updated 7 months ago
OpenNLPLab / MMVAE-AVS
Multimodal Variational Auto-encoder based Audio-Visual Segmentation [ICCV2023].
☆20 · Sep 19, 2024 · Updated last year
Franklin905 / VALOR
Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser"
☆17 · Jul 13, 2025 · Updated last year
NAVER-INTEL-Co-Lab / gaudi-lavcap
☆15 · Jan 24, 2025 · Updated last year
GeWu-Lab / MUSIC-AVQA
MUSIC-AVQA, CVPR2022 (ORAL)
☆100 · Dec 30, 2022 · Updated 3 years ago
ruohaoguo / ovavss
Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].
☆37 · Nov 2, 2024 · Updated last year
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
weiguoPian / AV-CIL_ICCV2023
[ICCV 2023] Audio-Visual Class-Incremental Learning
☆35 · Sep 29, 2024 · Updated last year
rumc3dlab / 3dlandmarkdetection
This repository holds the "Fully automated landmarking and facial segmentation on 3D photographs" files
☆29 · Oct 23, 2023 · Updated 2 years ago
ku-vai / TPoS
This repository is for The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion (ICCV2023)
☆25 · Dec 7, 2023 · Updated 2 years ago
Augusta-A / Awesome-EfficientVideo
☆12 · Sep 11, 2021 · Updated 4 years ago
stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
kaiw7 / STG-CMA
Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation
☆15 · Apr 13, 2024 · Updated 2 years ago
GeWu-Lab / Stepping-Stones
The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
☆18 · Oct 11, 2024 · Updated last year
gaozhitong / ATTA
Code for "ATTA: Anomaly-aware Test-Time Adaptation for Out-of-Distribution Detection in Segmentation" (NeurIPS 23)
☆16 · Apr 12, 2024 · Updated 2 years ago
SitongGong / Veason-R1
Official code of Veason-R1
☆15 · Jul 14, 2026 · Updated last week
fyyCS / LSLD
☆14 · Nov 13, 2023 · Updated 2 years ago
FloretCat / CMRAN
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization, ACM MM 2020
☆33 · Nov 6, 2020 · Updated 5 years ago
XLearning-SCU / 2025-ICLR-TCR
Pytorch implementation of "Test-time Adaptation for Cross-modal Retrieval with Query Shift".
☆35 · Nov 22, 2025 · Updated 7 months ago
CASIA-IVA-Lab / VALOR
[TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
☆311 · Dec 25, 2024 · Updated last year
stoneMo / AVGN
Official implementation for AVGN
☆41 · Mar 24, 2023 · Updated 3 years ago
stoneMo / DeepAVFusion
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
☆43 · Aug 2, 2024 · Updated last year
AronCao49 / Latte
[ECCV 2024] Reliable Spatial-Temporal Voxels for Multi-Modal Test-Time Adaptation
☆18 · Jan 12, 2026 · Updated 6 months ago
lavendery / AudioComposer
☆27 · Sep 10, 2025 · Updated 10 months ago
choijeongsoo / av2av
[CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
☆48 · Sep 6, 2024 · Updated last year
ypwang61 / StoryEval
[CVPR2025] Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
☆20 · May 2, 2025 · Updated last year
GeWu-Lab / CSOL_TPAMI2021
The repo for "Class-aware Sounding Objects Localization", TPAMI 2021.
☆29 · Mar 4, 2022 · Updated 4 years ago
zhoujiahuan1991 / AAAI2025-SVP
☆18 · Apr 18, 2025 · Updated last year
ZikunZhou / GTELT
An official implementation for "Global Tracking via Ensemble of Local Trackers"
☆11 · Mar 13, 2022 · Updated 4 years ago
chenhaoxing / ASL
This repository is the code of the paper "Shaping Visual Representations with Attributes for Few-Shot Learning (IEEE SPL)".
☆10 · Mar 14, 2023 · Updated 3 years ago
XLearning-SCU / 2024-ICLR-READ
Pytorch implementation of "Test-time Adaption against Multi-modal Reliability Bias".
☆54 · Dec 24, 2024 · Updated last year
fredfung007 / snlt
☆15 · Dec 3, 2021 · Updated 4 years ago