kkzhang95/Awesome-Composed-Multi-modal-Retrieval

kkzhang95 / Awesome-Composed-Multi-modal-Retrieval

A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR), Composed Video Retrieval (CVR), and related tasks.

☆90

Alternatives and similar repositories for Awesome-Composed-Multi-modal-Retrieval

Users who are interested in Awesome-Composed-Multi-modal-Retrieval are comparing it to the repositories listed below.

iLearn-Lab / TOIS25-Awesome-Composed-Image-Retrieval
Collection of Composed Image Retrieval (CIR) papers.
☆361 · Jun 8, 2026 · Updated last month
Badgewho / WSISum
Code for "WSISum: WSI summarization via dual-level semantic reconstruction"
☆20 · Jan 9, 2026 · Updated 6 months ago
Pter61 / osrcir
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight]
☆72 · Jul 8, 2025 · Updated last year
miccunifi / CIRCO
[ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset
☆87 · Aug 6, 2025 · Updated 11 months ago
icq-benchmark / icq-benchmark
☆19 · Jul 28, 2025 · Updated last year
Luan-zb / SuRe-Transformer
☆13 · Jun 24, 2025 · Updated last year
CrossmodalGroup / ER-SAN
Implementation of our IJCAI2022 oral paper, ER-SAN: Enhanced-Adaptive Relation Self-Attention Network for Image Captioning.
☆25 · Aug 5, 2023 · Updated 2 years ago
OmkarThawakar / composed-video-retrieval
Composed Video Retrieval
☆62 · May 2, 2024 · Updated 2 years ago
fuxianghuang1 / Multimodal-Composite-Editing-and-Retrieval
Multimodal-Composite-Editing-and-Retrieval-update
☆35 · Oct 13, 2025 · Updated 9 months ago
CrossmodalGroup / ESL
☆12 · May 3, 2024 · Updated 2 years ago
CrossmodalGroup / LAPS
Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment, CVPR, 2024
☆110 · Jun 26, 2025 · Updated last year
CrossmodalGroup / NAAF
Implementation of our CVPR2022 paper, Negative-Aware Attention Framework for Image-Text Matching.
☆119 · Jun 19, 2023 · Updated 3 years ago
mvrl / ConText-CIR
[CVPR'25] ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval
☆16 · Jun 17, 2026 · Updated last month
LanCole / Awesome-Remote-Sensing-Cross-Modal-Image-Text-Retrieval
A collection of papers, datasets, benchmarks, code, and model weights for Remote Sensing Cross-Modal Image-Text Retrieval (RSCMIT).
☆39 · Jul 6, 2026 · Updated 3 weeks ago
chunmeifeng / SPRC
[ICLR 2024, Spotlight] Sentence-level Prompts Benefit Composed Image Retrieval
☆94 · Apr 16, 2024 · Updated 2 years ago
sung-yeon-kim / GENIUS-CVPR25
Official Implementation of GENIUS: A Generative Framework for Universal Multimodal Search, CVPR 2025
☆55 · Aug 8, 2025 · Updated 11 months ago
hmchuong / CoLLM
[CVPR25] CoLLM: A Large Language Model for Composed Image Retrieval
☆28 · Mar 26, 2025 · Updated last year
navervision / lincir
Official PyTorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)
☆148 · Jan 5, 2026 · Updated 6 months ago
iLearn-Lab / SIGIR24-DQU-CIR
[SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval
☆44 · Jul 14, 2024 · Updated 2 years ago
lucas-ventura / CoVR
Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions".
☆119 · Apr 21, 2026 · Updated 3 months ago
ExplainableML / EgoCVR
[ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
☆41 · Apr 11, 2025 · Updated last year
ToniChopp / SimCroP
The official implementation of "SimCroP: Radiograph Representation Learning with Similarity-driven Cross-granularity Pre-training" (MICCA…
☆17 · Jan 24, 2026 · Updated 6 months ago
Pter61 / context-i2w
Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral]
☆54 · May 27, 2025 · Updated last year
cross-modal-retrieval / cross-modal-retrieval
☆34 · Oct 1, 2024 · Updated last year
kdwonn / DivE
Repository of "Improving Cross-Modal Retrieval With Set of Diverse Embeddings" (CVPR'23, Highlight)
☆41 · Nov 15, 2023 · Updated 2 years ago
Code-kunkun / ZS-CIR
[BMVC 2023] Zero-shot Composed Text-Image Retrieval
☆55 · Nov 26, 2024 · Updated last year
Paranioar / RCAR
[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”
☆34 · Apr 11, 2024 · Updated 2 years ago
ExplainableML / Vision_by_Language
[ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"
☆89 · Jul 4, 2024 · Updated 2 years ago
TangXu-Group / Cross-modal-remote-sensing-image-and-text-retrieval-models
☆22 · Sep 19, 2024 · Updated last year
May2333 / FDCA
[ICLR 2025] This repo is the official implementation of our paper "Learning Fine-Grained Representations through Textual Token Disentangl…
☆23 · Jul 28, 2025 · Updated last year
jameslahm / SCPNet
Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels [CVPR 2023]
☆14 · Sep 23, 2023 · Updated 2 years ago
VectorSpaceLab / MegaPairs
[ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval
☆248 · Nov 6, 2025 · Updated 8 months ago
hongwang600 / fashion-iq-metadata
This repo contains some useful metadata for the Fashion IQ challenge: https://sites.google.com/view/lingir/fashion-iq
☆15 · Jun 28, 2019 · Updated 7 years ago
Yarayx / livelongbench
The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…
☆12 · Jun 28, 2025 · Updated last year
pengfei-luo / ImageScope
[WWW 2025 Oral] ImageScope: Unifying Language-Guided Image Retrieval via Large Multimodal Model Collective Reasoning
☆21 · Jul 2, 2025 · Updated last year
Thorin215 / GRE
☆18 · Sep 19, 2025 · Updated 10 months ago
CrossmodalGroup / CMCAN
Implementation of our AAAI2022 paper, Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching.
☆36 · Jun 16, 2023 · Updated 3 years ago
Delong-liu-bupt / Composed_Person_Retrieval
[NeurIPS 2025] Composed Person Retrieval (CPR) is a new cross-modal retrieval task that aims to identify individuals in large-scale perso…
☆76 · Jun 29, 2026 · Updated last month
Lee-zixu / FineCIR
☆12 · Mar 31, 2025 · Updated last year