FeiElysia / ViECap
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023
☆156Updated 10 months ago
Alternatives and similar repositories for ViECap
Users who are interested in ViECap are comparing it to the repositories listed below
- Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval --ICCV2023 Oral☆92Updated last year
- MomentDiff: Generative Video Moment Retrieval from Random to Real--NeurIPS 2023☆79Updated last year
- [IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering☆74Updated 2 years ago
- Accepted by ICCV2023, Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-bas…☆101Updated last year
- Balanced Classification: A Unified Framework for Long-Tailed Object Detection (TMM 2023)☆99Updated 2 months ago
- ☆86Updated 2 years ago
- ☆88Updated last year
- Code release for “On-the-fly Category Discovery” (CVPR 2023)☆52Updated 2 years ago
- ☆83Updated last month
- Official implementation of BMVC2023 Oral paper: 《Describe Your Facial Expressions by Linking Image Encoders and Large Language Models》☆62Updated 2 months ago
- "Towards Semi-supervised Learning with Non-random Missing Labels" by Yue Duan (ICCV 2023)☆77Updated 7 months ago
- [ACM MM 2021 Oral] Official repo of "Neighbor-view Enhanced Model for Vision and Language Navigation"☆77Updated 2 years ago
- [CVPR 2024] SimDA: Simple Diffusion Adapter for Efficient Video Generation☆128Updated last year
- [IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering☆19Updated 2 years ago
- Official implementation of "Self-slimmed Vision Transformer" (ECCV2022)☆72Updated 2 years ago
- The implementation of CoDT on the task of NTU-60+->PKUMMD☆72Updated 2 years ago
- Towards Better Stability and Adaptability: Improve Online Self-Training for Model Adaptation in Semantic Segmentation (CVPR-2023)☆80Updated last year
- [BMVC2023] Spatial and Planar Consistency for Semi-Supervised Volumetric Medical Image Segmentation☆76Updated 9 months ago
- [ICCV 2023] Official implementation of "Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement"☆70Updated last year
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding" (CVPR 2025 Highlight)☆28Updated 2 months ago
- Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text …☆14Updated 7 months ago
- [CVPR-2023] Official Codes for "TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompte…☆95Updated 11 months ago
- "MutexMatch: Semi-Supervised Learning with Mutex-Based Consistency Regularization" by Yue Duan (TNNLS)☆71Updated 7 months ago
- [TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning".☆10Updated 11 months ago
- The code is for PBRnet for action detection☆73Updated 4 years ago
- [NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations☆138Updated last year
- Panoptic Scene Graph Biased Annotation☆35Updated last year
- ☆62Updated last year
- [CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning☆120Updated 6 months ago
- A unified and simple codebase for weakly-supervised temporal action localization☆19Updated last year