cwj1412 / MSCOCO-Flikcr30K_FG
Benchmark data for "Rethinking Benchmarks for Cross-modal Image-text Retrieval" (SIGIR 2023)
☆26 Updated last year
Alternatives and similar repositories for MSCOCO-Flikcr30K_FG:
Users who are interested in MSCOCO-Flikcr30K_FG are comparing it to the repositories listed below
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval ☆74 Updated 9 months ago
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral] ☆45 Updated 2 months ago
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024 ☆28 Updated 10 months ago
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding ☆45 Updated 6 months ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes! ☆24 Updated 2 months ago
- Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing" ☆74 Updated last year
- [TIP 2023] The code of “Plug-and-Play Regulators for Image-Text Matching” ☆29 Updated 9 months ago
- ☆28 Updated last year
- [ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives ☆25 Updated 3 months ago
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval ☆51 Updated 2 months ago
- ☆34 Updated 2 years ago
- Composed Video Retrieval ☆49 Updated 8 months ago
- (ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning ☆25 Updated 4 months ago
- [CVPR 2023] VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval ☆38 Updated last year
- Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. CVPR 2023 ☆58 Updated 3 months ago
- ☆29 Updated 10 months ago
- ☆61 Updated last year
- Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models ☆46 Updated last year
- NegCLIP. ☆30 Updated last year
- 📍 Official PyTorch implementation of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS) ☆52 Updated last year
- ☆43 Updated last year
- ☆24 Updated last year
- A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval ☆42 Updated 2 years ago
- ☆89 Updated last year
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by … ☆72 Updated last year
- ☆34 Updated last year
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024) ☆43 Updated 6 months ago
- [ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" ☆66 Updated 3 years ago
- The official implementation for BLIP4CIR with bi-directional training | Bi-directional Training for Composed Image Retrieval via Text Pro… ☆28 Updated 11 months ago
- ☆23 Updated 4 months ago