enyac-group / T-VSL

☆13

Alternatives and similar repositories for T-VSL:

Users who are interested in T-VSL are comparing it to the repositories listed below.

schowdhury671 / meerkat
☆27 Updated 6 months ago
jinxiang-liu / anno-free-AVS
Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"
☆29 Updated 6 months ago
yannqi / COMBO-AVS
[CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-…
☆39 Updated this week
GenjiB / LAVISH
Vision Transformers are Parameter-Efficient Audio-Visual Learners
☆99 Updated last year
vvvb-github / AVSegFormer
[AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer
☆63 Updated last month
Franklin905 / VALOR
Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser"
☆18 Updated last year
GeWu-Lab / Generalizable-Audio-Visual-Segmentation
Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024
☆19 Updated last year
rikeilong / Bay-CAT
[ECCV’24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenario…
☆52 Updated 7 months ago
fyyCS / LSLD
☆14 Updated last year
GeWu-Lab / Crab
☆13 Updated last month
ttgeng233 / UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
☆63 Updated last year
stoneMo / DeepAVFusion
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
☆31 Updated 8 months ago
JacobChalk / TIM
Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition"
☆39 Updated 5 months ago
GeWu-Lab / MMCosine_ICASSP23
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆19 Updated last year
GeWu-Lab / TSPM
Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024.
☆16 Updated 5 months ago
swimmiing / ACL-SSL
Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?"
☆16 Updated 2 months ago
jasongief / OV-AVEL
[2025 CVPR] Towards Open-Vocabulary Audio-Visual Event Localization
☆17 Updated last month
haoyi-duan / DG-SCT
NeurIPS'2023 official implementation code
☆61 Updated last year
stoneMo / CIGN
Official implementation for CIGN
☆15 Updated last year
stoneMo / EZ-VSL
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
☆33 Updated 2 years ago
GeWu-Lab / Stepping-Stones
The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
☆14 Updated 6 months ago
ttgeng233 / UniAV
Unified Audio-Visual Perception for Multi-Task Video Localization
☆24 Updated last year
GeWu-Lab / Ref-AVS
The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024
☆37 Updated 4 months ago
sangmin-git / MMSI
Code for "Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations" (CVPR 2024 Oral)
☆16 Updated 9 months ago
gyx-gloria / DMT
Official Implementation of DMT: Dual Mean-Teacher in PyTorch.
☆12 Updated last year
hxixixh / mix-and-localize
☆20 Updated last year
weiguoPian / AV-CIL_ICCV2023
☆22 Updated 6 months ago
aspirinone / CATR.github.io
☆32 Updated last year
jasongief / CPSP
[2023 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line
☆29 Updated 2 years ago
stoneMo / AVGN
Official implementation for AVGN
☆34 Updated 2 years ago