Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query (ICCV2021)
☆20Dec 4, 2021Updated 4 years ago
Alternatives and similar repositories for Ask-Confirm
Users that are interested in Ask-Confirm are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [NeurIPS 2019] Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries☆12Apr 15, 2022Updated 3 years ago
- Who's Waldo? Linking People Across Text and Images. ICCV 2021.☆13May 17, 2023Updated 2 years ago
- ☆12Mar 12, 2023Updated 3 years ago
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions"☆43Jul 15, 2022Updated 3 years ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆25Nov 23, 2024Updated last year
- ☆11Dec 24, 2020Updated 5 years ago
- [Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training☆22Mar 19, 2022Updated 4 years ago
- Data release for Step Differences in Instructional Video (CVPR24)☆14Jun 19, 2024Updated last year
- code for AAAI21 paper "Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion“☆28Jan 7, 2021Updated 5 years ago
- Extracting optical flow based on GPU in Opencv3☆12Jul 29, 2019Updated 6 years ago
- Official repository of ICCV 2021 - Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models☆131Oct 17, 2025Updated 5 months ago
- Adaptive Offline Quintuplet Loss for Image-Text Matching (AOQ)☆34Jul 2, 2020Updated 5 years ago
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u…☆25Jun 4, 2025Updated 9 months ago
- Code for "Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search"☆63Apr 16, 2021Updated 4 years ago
- Official code for WACV 2021 paper - Compositional Learning of Image-Text Query for Image Retrieval☆56Oct 8, 2021Updated 4 years ago
- Dynamic Modality Interaction Modeling for Image-Text Retrieval. SIGIR'21☆70May 26, 2022Updated 3 years ago
- Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity"☆61Jun 12, 2023Updated 2 years ago
- Modality-Agnostic Attention Fusion for visual search with text feedback☆25Mar 21, 2023Updated 3 years ago
- Implementation of our ACMMM2019 paper, Focus Your Attention: A Bidirectional Focal Attention Network for Image-Text Matching☆39Jun 19, 2023Updated 2 years ago
- Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024☆21May 30, 2024Updated last year
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"☆38Aug 18, 2024Updated last year
- Placeholder for code of BSP.☆11Aug 13, 2021Updated 4 years ago
- ☆18Jan 4, 2024Updated 2 years ago
- Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for …☆61Oct 21, 2022Updated 3 years ago
- Learning Cross-Modal Retrieval with Noisy Labels (CVPR 2021, PyTorch Code)☆55Mar 5, 2023Updated 3 years ago
- One-Pixel Shortcut: on the Learning Preference of Deep Neural Networks (ICLR 2023 Spotlight)☆14Sep 28, 2025Updated 5 months ago
- ☆10Jul 28, 2022Updated 3 years ago
- [CVPR 2022] Visual Abductive Reasoning☆124Oct 22, 2024Updated last year
- Code for AAAI 2020 paper Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect.☆20Jan 7, 2021Updated 5 years ago
- Code and benchmarks for the Semantic Video Retrieval Task☆53Oct 18, 2022Updated 3 years ago
- Official implementation of the Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT) | ICCV 2021 - Image Retrieval o…☆40Jun 26, 2024Updated last year
- ☆30May 7, 2021Updated 4 years ago
- Official repository of "Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach" (ACL 2024 Oral)☆35Mar 24, 2025Updated last year
- Learning from noisy labels via regularization between representations☆11Feb 28, 2023Updated 3 years ago
- Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression☆14Mar 22, 2025Updated last year
- Ladder Loss for Coherent Visual-Semantic Embedding, AAAI, 2020☆13Aug 14, 2021Updated 4 years ago
- The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretr…☆445Sep 25, 2025Updated 5 months ago
- Code for AAAI 2020 paper "Asymmetric Co-Teaching for Unsupervised Cross Domain Person Re-Identification"☆20Oct 5, 2022Updated 3 years ago
- ☆11Dec 8, 2022Updated 3 years ago