yingchengy / AVMOE
[NeurIPS 2024] Mixture of Experts for Audio-Visual Learning
☆23 · Jan 19, 2025 · Updated last year
Alternatives and similar repositories for AVMOE
Users who are interested in AVMOE are comparing it to the libraries listed below.
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation" ☆37 · Oct 11, 2024 · Updated last year
- Transactions on Multimedia (TMM25) ☆19 · Apr 8, 2025 · Updated 10 months ago
- The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024 ☆50 · Oct 12, 2025 · Updated 4 months ago
- Official Repository of 'Multi-Scale Temporal Mamba for Efficient Temporal Action Detection' ☆34 · Jan 23, 2026 · Updated 3 weeks ago
- Official implementation of paper "OED: Towards One-stage End-to-End Dynamic Scene Graph Generation". ☆26 · Mar 26, 2024 · Updated last year
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ☆55 · Jul 1, 2025 · Updated 7 months ago
- ☆28 · Apr 8, 2025 · Updated 10 months ago
- Official Pytorch Implementation of the framework TEMPURA proposed in our paper Unbiased Scene Graph Generation in Videos accepted by CVPR… ☆24 · Sep 9, 2025 · Updated 5 months ago
- For Ego4D VQ3D Task ☆22 · Jan 9, 2024 · Updated 2 years ago
- Official implementation of the paper DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction ☆29 · May 26, 2024 · Updated last year
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral]. ☆35 · Nov 2, 2024 · Updated last year
- [AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer ☆73 · Mar 6, 2025 · Updated 11 months ago
- Code for paper 'Leveraging Predicate and Triplet Learning for Scene Graph Generation'. (CVPR 2024) ☆32 · Sep 6, 2025 · Updated 5 months ago
- ☆17 · Sep 23, 2025 · Updated 4 months ago
- The official implementation of our work Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanc… ☆12 · Oct 14, 2024 · Updated last year
- ☆33 · Feb 29, 2024 · Updated last year
- ☆32 · Mar 1, 2024 · Updated last year
- [ECCV 2024] Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation ☆35 · Jan 6, 2025 · Updated last year
- [ICML 2024] The official implementation of SemiRES in PyTorch. ☆33 · Jun 20, 2024 · Updated last year
- [TIP 2025] The implementation of "Uncertainty Guided Refinement for Fine-grained Salient Object Detection" ☆15 · Apr 20, 2025 · Updated 9 months ago
- The repository of VG-Refiner paper ☆17 · Dec 9, 2025 · Updated 2 months ago
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking ☆13 · Apr 12, 2023 · Updated 2 years ago
- DisTime: Distribution-based Time Representation for Video Large Language Models. ☆18 · Jul 10, 2025 · Updated 7 months ago
- ☆12 · Feb 7, 2018 · Updated 8 years ago
- The core library of the DFKI multisensor pipeline framework. ☆11 · May 23, 2022 · Updated 3 years ago
- This is the official repository for the paper "Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction". ICCV … ☆23 · Dec 4, 2025 · Updated 2 months ago
- ☆11 · Jan 18, 2025 · Updated last year
- ☆10 · Apr 7, 2025 · Updated 10 months ago
- Contains implementation of the DoubIL and ResiduIL algorithms from the ICML '22 paper Causal Imitation Learning under Temporally Correlat… ☆11 · Dec 9, 2022 · Updated 3 years ago
- code for LSN ☆10 · Oct 28, 2024 · Updated last year
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation ☆13 · Jun 17, 2024 · Updated last year
- [CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-… ☆40 · Apr 20, 2025 · Updated 9 months ago
- [NeurIPS 2025] Panoptic Captioning: An Equivalence Bridge for Image and Text ☆33 · Jan 31, 2026 · Updated 2 weeks ago
- Source code of the paper "The NeRF Signature: Codebook-Aided Watermarking for Neural Radiance Fields". ☆17 · Mar 3, 2025 · Updated 11 months ago
- Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers" ☆14 · Feb 24, 2025 · Updated 11 months ago
- Audio-Visual Perception of Omnidirectional Video for Virtual Reality Applications ☆15 · Feb 22, 2023 · Updated 2 years ago
- Progressive Language-guided Visual Learning for Multi-Task Visual Grounding ☆13 · May 9, 2025 · Updated 9 months ago
- Agentic Keyframe Search for Video Question Answering ☆15 · Apr 7, 2025 · Updated 10 months ago
- ☆13 · Jan 21, 2025 · Updated last year