[ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
☆41 Apr 11, 2025 Updated 11 months ago
Alternatives and similar repositories for EgoCVR
Users interested in EgoCVR are comparing it to the repositories listed below.
- Composed Video Retrieval ☆63 May 2, 2024 Updated last year
- Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions". ☆119 Oct 9, 2025 Updated 5 months ago
- [ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval" ☆84 Jul 4, 2024 Updated last year
- ☆20 Jul 28, 2025 Updated 7 months ago
- Official Code for Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions ☆16 Dec 27, 2023 Updated 2 years ago
- [SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval ☆44 Jul 14, 2024 Updated last year
- ☆10 Feb 13, 2025 Updated last year
- Collection of Composed Image Retrieval (CIR) papers. ☆322 Dec 22, 2025 Updated 2 months ago
- [ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion ☆197 Jul 31, 2025 Updated 7 months ago
- Code for the paper "If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection" ☆27 Jul 10, 2023 Updated 2 years ago
- This is the official implementation of RGNet: A Unified Retrieval and Grounding Network for Long Videos ☆19 Mar 3, 2025 Updated last year
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral] ☆56 May 27, 2025 Updated 9 months ago
- Codes of the Fine-grained Textual Inversion network for Zero-Shot Composed Image Retrieval ☆27 Apr 7, 2025 Updated 11 months ago
- [ICLR'25] Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions? ☆12 Apr 11, 2025 Updated 11 months ago
- (ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning ☆28 Sep 27, 2024 Updated last year
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset ☆86 Aug 6, 2025 Updated 7 months ago
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos ☆33 May 27, 2025 Updated 9 months ago
- Official PyTorch code of GroundVQA (CVPR'24) ☆64 Sep 13, 2024 Updated last year
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024) ☆144 Jan 5, 2026 Updated 2 months ago
- Pytorch implementation for Egoinstructor at CVPR 2024 ☆28 Dec 1, 2024 Updated last year
- In this codebase we establish a benchmark for egocentric user adaptation based on Ego4d. First, we start from a population model which ha… ☆15 Jan 16, 2025 Updated last year
- Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval -- AAAI2025 ☆17 Jul 14, 2025 Updated 8 months ago
- [ICLR 2025] This repo is the official implementation of our paper "Learning Fine-Grained Representations through Textual Token Disentangl… ☆23 Jul 28, 2025 Updated 7 months ago
- Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024 ☆58 Aug 19, 2025 Updated 7 months ago
- Data release for Step Differences in Instructional Video (CVPR24) ☆14 Jun 19, 2024 Updated last year
- ICCV'23 Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval ☆19 Aug 22, 2025 Updated 6 months ago
- Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024 ☆21 May 30, 2024 Updated last year
- Source code of our MM'22 paper Partially Relevant Video Retrieval ☆55 Nov 4, 2024 Updated last year
- [CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Lan… ☆63 Mar 25, 2025 Updated 11 months ago
- [CVPR 2024] Data and benchmark code for the EgoExoLearn dataset ☆81 Aug 26, 2025 Updated 6 months ago
- Human-centric environment representations from egocentric video ☆14 Feb 5, 2026 Updated last month
- [2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval ☆42 Sep 23, 2021 Updated 4 years ago
- [ECCV2024] The official implementation of "Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation". ☆13 Feb 24, 2025 Updated last year
- HD-EPIC Python script to download the entire datasets or parts of it ☆17 Oct 7, 2025 Updated 5 months ago
- Egocentric Video Understanding Dataset (EVUD) ☆33 Jul 4, 2024 Updated last year
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)" ☆25 Feb 2, 2025 Updated last year
- [SIGIR'2024 Best Paper Honorable Mention] Official repository for "LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Compose… ☆73 Mar 14, 2025 Updated last year
- ☆197 Mar 5, 2025 Updated last year
- The official implementation of Error Detection in Egocentric Procedural Task Videos ☆22 Sep 20, 2025 Updated 6 months ago