jasongief/CPSP

[2022 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line

☆32

Alternatives and similar repositories for CPSP

Users who are interested in CPSP are comparing it to the repositories listed below.

jasongief / TGS-Agent
[2026 AAAI] Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation
☆20 · Nov 8, 2025 · Updated 8 months ago
marmot-xy / CMBS
cross modal background suppression for audio-visual event localization
☆36 · Mar 18, 2022 · Updated 4 years ago
jasongief / OV-AVEL
[2025 CVPR] Towards Open-Vocabulary Audio-Visual Event Localization
☆46 · Mar 7, 2025 · Updated last year
jasongief / PSP_CVPR_2021
[2021 CVPR] Positive Sample Propagation along the Audio-Visual Event Line
☆42 · Jul 5, 2022 · Updated 4 years ago
FloretCat / CMRAN
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization, ACM MM 2020
☆33 · Nov 6, 2020 · Updated 5 years ago
ubc-vision / TriBERT
Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation" in NeurIPS…
☆14 · Dec 9, 2021 · Updated 4 years ago
jasongief / Mettle
[2025 TPAMI] Mettle: Meta-Token Learning for Memory-Efficient Audio-Visual Adaptation
☆18 · Jan 3, 2026 · Updated 6 months ago
Minato-Zackie / SMoLoRA
This is the official code implementation of "SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction …
☆17 · Feb 27, 2026 · Updated 4 months ago
GeWu-Lab / MUSIC-AVQA
MUSIC-AVQA, CVPR2022 (ORAL)
☆100 · Dec 30, 2022 · Updated 3 years ago
ttgeng233 / UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
☆73 · Jan 4, 2026 · Updated 6 months ago
YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆90 · Jul 25, 2024 · Updated last year
kunli-cs / PCAN
[AAAI 2025] Official implementation of the paper: Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition
☆15 · Jul 16, 2025 · Updated last year
MengyuanChen21 / CVPR2023-CMPAE
[CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
☆37 · Jun 17, 2023 · Updated 3 years ago
xiaobai1217 / DomainAdaptation
CVPR2022
☆23 · Jul 27, 2022 · Updated 3 years ago
GenjiB / LAVISH
Vision Transformers are Parameter-Efficient Audio-Visual Learners
☆107 · Aug 11, 2023 · Updated 2 years ago
fyyCS / LSLD
☆14 · Nov 13, 2023 · Updated 2 years ago
haoyi-duan / DG-SCT
NeurIPS'2023 official implementation code
☆70 · Nov 11, 2023 · Updated 2 years ago
MengyuanChen21 / Re-EDL
[TPAMI 2025] Revisiting Essential and Non-Essential Settings of Evidential Deep Learning
☆26 · Jun 24, 2025 · Updated last year
JustinYuu / MACIL_SD
[ACM MM 2022] Modality-aware Contrastive Instance Learning with Self-Distillation for Weakly-Supervised Audio-Visual Violence Detection
☆42 · Jul 13, 2022 · Updated 4 years ago
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
OpenNLPLab / AVSBench
[ECCV 2022] & [IJCV 2024] Official implementation of the paper: Audio-Visual Segmentation (with Semantics)
☆420 · Nov 18, 2024 · Updated last year
MCG-NJU / JoMoLD
[ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
☆27 · Jul 15, 2022 · Updated 4 years ago
ttgeng233 / UniAV
Unified Audio-Visual Perception for Multi-Task Video Localization
☆33 · Apr 19, 2024 · Updated 2 years ago
stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
MengyuanChen21 / CVPR2023-OWTAL
[CVPR 2023] Cascade Evidential Learning for Open-world Weakly-supervised Temporal Action Localization
☆12 · Jul 9, 2024 · Updated 2 years ago
brian7685 / Multimodal-Clustering-Network
ICCV 2021
☆34 · May 11, 2022 · Updated 4 years ago
hche11 / VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
☆359 · Sep 13, 2021 · Updated 4 years ago
kaiw7 / STG-CMA
Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation
☆15 · Apr 13, 2024 · Updated 2 years ago
yujiangpu20 / cma_xdVioDet
Official code for "Audio-Guided Attention Network for Weakly Supervised Violence Detection" (ICCECE2022).
☆13 · Mar 25, 2022 · Updated 4 years ago
jinbae-s / ACVIS
[ICASSP 2026] The official pytorch implementation of ACVIS
☆15 · Jan 19, 2026 · Updated 6 months ago
zhangguanghao523 / CMMCoT
[AAAI'26] Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augm…
☆11 · Dec 5, 2025 · Updated 7 months ago
MCG-NJU / PointTAD
[NeurIPS 2022] PointTAD: Multi-Label Temporal Action Detection with Learnable Query Points
☆48 · Nov 24, 2023 · Updated 2 years ago
KlingAIResearch / VidEmo
[NeurIPS'25] VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models
☆15 · Dec 7, 2025 · Updated 7 months ago
MengyuanChen21 / NeurIPS2024-CSP
[NeurIPS 2024] Conjugated Semantic Pool Improves OOD Detection with Pre-trained Vision-Language Models
☆40 · Oct 17, 2024 · Updated last year
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15 · Dec 3, 2024 · Updated last year
michaelneri / unsupervised-audio-anomaly-detection
Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …
☆11 · Nov 6, 2024 · Updated last year
zihuixue / MFH
[ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation
☆44 · Jul 10, 2023 · Updated 3 years ago
Lzq5 / Video-Text-Alignment
☆28 · Jul 18, 2025 · Updated last year
GeWu-Lab / LFAV
Towards Long Form Audio-visual Video Understanding
☆15 · Jan 16, 2026 · Updated 6 months ago