lerogo/aaai24_itr_cusa

Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval"

☆55

Alternatives and similar repositories for aaai24_itr_cusa

Users who are interested in aaai24_itr_cusa are comparing it to the libraries listed below.

vkhoi / cora_cvpr24
☆28 · Sep 3, 2024 · Updated last year
Ji-Haoyang / FGVLA
The code of Fine-Grained Visual-Language Alignment for Remote Sensing Image-Text Retrieval (IEEE Transactions on Geoscience and Remote Sen…
☆15 · Jun 30, 2025 · Updated last year
ppanzx / CHAN
☆54 · Sep 13, 2023 · Updated 2 years ago
jaychempan / PriorCLIP
Official Code for “PriorCLIP: Visual Prior Guided Vision-Language Model for Remote Sensing Image-Text Retrieval”
☆30 · Dec 19, 2025 · Updated 7 months ago
Paranioar / RCAR
[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”
☆34 · Apr 11, 2024 · Updated 2 years ago
vl2g / CSTBIR
Official Code for Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions
☆15 · Dec 27, 2023 · Updated 2 years ago
CrossmodalGroup / HREM
Learning Semantic Relationship among Instances for Image-Text Matching, CVPR, 2023
☆93 · Apr 21, 2025 · Updated last year
mesnico / ALADIN
Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval"
☆28 · Dec 6, 2023 · Updated 2 years ago
KevinLight831 / ESA
[TCSVT2023] - ESA: External Space Attention Aggregation for Image-Text Retrieval
☆23 · Aug 30, 2024 · Updated last year
cyh-sj / CGMN
The code of the paper "Cross-Modal Graph Matching Network for Image-Text Retrieval" in ACM Transactions on Multimedia Computing, Communic…
☆45 · Jun 5, 2023 · Updated 3 years ago
CFM-MSG / Code-AUL
☆19 · Mar 5, 2024 · Updated 2 years ago
taewhankim / VIPCAP
☆15 · Dec 31, 2024 · Updated last year
miccunifi / Cross-the-Gap
[ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
☆70 · Nov 30, 2025 · Updated 7 months ago
bytedance / ParGo
Official PyTorch Implementation of ParGo: Bridging Vision-Language with Partial and Global Views. (AAAI 2025)
☆16 · Jan 7, 2025 · Updated last year
LuminosityX / FNE
Implementation of our paper, Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination.
☆20 · Dec 3, 2023 · Updated 2 years ago
linhuixiao / HiVG
[ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.
☆65 · Nov 10, 2025 · Updated 8 months ago
LunarShen / TempMe
[ICLR 2025] TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
☆27 · Feb 13, 2025 · Updated last year
LuminosityX / HAT
Implementation of our paper, 'Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.'
☆27 · Dec 3, 2023 · Updated 2 years ago
Noah888 / DAR
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
☆15 · Oct 22, 2024 · Updated last year
96-Zachary / vse_2ad
☆15 · Apr 30, 2022 · Updated 4 years ago
siyuancncd / FUME
This is the official implementation of "Fuzzy Multimodal Learning for Trusted Cross-modal Retrieval" (CVPR 2025)
☆40 · Jul 6, 2026 · Updated 3 weeks ago
YCaigogogo / CODER
☆22 · Apr 27, 2024 · Updated 2 years ago
QiQAng / UEDVC
☆12 · May 26, 2023 · Updated 3 years ago
hhc1997 / L2RM
☆43 · Mar 28, 2024 · Updated 2 years ago
zchoi / GLSCL
[TIP25] Code for "Text-Video Retrieval with Global-Local Semantic Consistent Learning"
☆16 · May 12, 2025 · Updated last year
anosorae / IRRA
Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval (CVPR 2023)
☆286 · Mar 26, 2025 · Updated last year
XLearning-SCU / 2025-ICLR-TCR
Pytorch implementation of "Test-time Adaptation for Cross-modal Retrieval with Query Shift".
☆35 · Nov 22, 2025 · Updated 8 months ago
hhc1997 / MSCN
☆12 · Mar 28, 2024 · Updated 2 years ago
hrtang22 / MUSE
Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)"
☆26 · Feb 2, 2025 · Updated last year
ChiYeungLaw / LexLIP-ICCV23
Official Code for the ICCV23 Paper: "LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval…
☆39 · Oct 14, 2023 · Updated 2 years ago
facebookresearch / SIEVE
SIEVE: Multimodal Dataset Pruning using Image-Captioning Models (CVPR 2024)
☆21 · Apr 28, 2024 · Updated 2 years ago
suoych / KEDs
Implementation of the paper Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval (CVPR 2024)
☆20 · Nov 4, 2024 · Updated last year
Jahawn-Wen / CAMeL-reID
[IEEE Transactions on Information Forensics and Security'25] Pytorch implementation of CAMeL: Cross-modality Adaptive Meta-Learning for T…
☆17 · Jan 5, 2026 · Updated 6 months ago
alipay / PC2-NoiseofWeb
Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text …
☆16 · Nov 20, 2025 · Updated 8 months ago
musicman217 / Text-Proxy
Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval -- AAAI2025
☆21 · May 8, 2026 · Updated 2 months ago
linhuixiao / OneRef
[NeurIPS 2024] OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling.
☆32 · Nov 13, 2025 · Updated 8 months ago
OmkarThawakar / composed-video-retrieval
Composed Video Retrieval
☆62 · May 2, 2024 · Updated 2 years ago
xfactlab / I0T
[ACL Main 2025] I0T: Embedding Standardization Method Towards Zero Modality Gap
☆12 · Jun 18, 2025 · Updated last year
multimodal-interpretability / nnn
Nearest Neighbor Normalization (EMNLP 2024)
☆21 · Nov 1, 2024 · Updated last year