LuminosityX/HAT


Implementation of our paper, 'Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.'

☆27

Alternatives and similar repositories for HAT

Users interested in HAT are comparing it to the repositories listed below.


LuminosityX / FNE
Implementation of our paper, Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination.
☆20 · Dec 3, 2023 · Updated 2 years ago
cwj1412 / MSCOCO-Flikcr30K_FG
Benchmark data for "Rethinking Benchmarks for Cross-modal Image-text Retrieval" (SIGIR 2023)
☆28 · Apr 24, 2023 · Updated 3 years ago
nguyentthong / video-language-understanding
[ACL’24 Findings] Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
☆48 · May 12, 2026 · Updated 2 months ago
ZhangXu0963 / VSL
The code of "Image-text Retrieval via Preserving Main Semantic of Vision" in ICME 2023.
☆15 · Dec 25, 2023 · Updated 2 years ago
CrossmodalGroup / ESL
☆12 · May 3, 2024 · Updated 2 years ago
taewhankim / VIPCAP
☆15 · Dec 31, 2024 · Updated last year
xiaoyuan1996 / MCRN
A multi-source cross-modal retrieval network
☆14 · Jan 8, 2024 · Updated 2 years ago
KevinLight831 / ESA
[TCSVT 2023] ESA: External Space Attention Aggregation for Image-Text Retrieval
☆23 · Aug 30, 2024 · Updated last year
lerogo / aaai24_itr_cusa
Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval"
☆55 · Mar 28, 2024 · Updated 2 years ago
ppanzx / CHAN
☆54 · Sep 13, 2023 · Updated 2 years ago
ZhangWeihang99 / HVSA
Official PyTorch implementation for Hypersphere-Based Remote Sensing Cross-Modal Text–Image Retrieval via Curriculum Learning.
☆16 · Aug 10, 2024 · Updated last year
96-Zachary / vse_2ad
☆15 · Apr 30, 2022 · Updated 4 years ago
Flame-Chasers / TBPS-CLIP
[AAAI 2024] An Empirical Study of CLIP for Text-based Person Search
☆81 · Mar 20, 2026 · Updated 4 months ago
devaansh100 / CLIPTrans
Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…
☆20 · Jun 3, 2024 · Updated 2 years ago
BMC-SDNU / Cross-Modal-Retrieval
Cross-Modal-Real-valuded-Retrieval
☆88 · Jul 18, 2023 · Updated 3 years ago
AAA-Zheng / Image-Text-Matching-Summary
Summary of Related Research on Image-Text Matching
☆75 · May 20, 2023 · Updated 3 years ago
zhangy0822 / USER
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024
☆33 · Jun 18, 2025 · Updated last year
CrossmodalGroup / HREM
Learning Semantic Relationship among Instances for Image-Text Matching, CVPR 2023
☆93 · Apr 21, 2025 · Updated last year
LCFractal / TGDT
Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training
☆30 · Jun 20, 2023 · Updated 3 years ago
FutureTwT / HMAH
The source code of "Teacher-Student Learning: Efficient Hierarchical Message Aggregation Hashing for Cross-Modal Retrieval." (Accepted by…
☆21 · Jun 7, 2022 · Updated 4 years ago
PaulLerner / ViQuAE
Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev…
☆39 · Dec 19, 2024 · Updated last year
GX77 / LCVSL
☆14 · Sep 28, 2023 · Updated 2 years ago
XuMengyaAmy / SwinMLP_TranCAP
☆13 · Jun 26, 2022 · Updated 4 years ago
Wusiwei0410 / SciMMIR
☆25 · Aug 1, 2024 · Updated last year
LeiJiangJNU / R3FA
3D Face Alignment, the 10th International Conference on Image and Graphics (ICIG 2019), Oral
☆11 · Dec 3, 2019 · Updated 6 years ago
ghchen18 / acl23_mclip
The official code and model for ACL 2023 paper 'mCLIP: Multilingual CLIP via Cross-lingual Transfer'
☆10 · Jan 23, 2024 · Updated 2 years ago
CrossmodalGroup / LAPS
Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment, CVPR 2024
☆110 · Jun 26, 2025 · Updated last year
duyngtr16061999 / KDMCSE
☆10 · Apr 7, 2024 · Updated 2 years ago
GQBBBB / UCI
☆10 · Oct 5, 2023 · Updated 2 years ago
hsiehjackson / Mr.Right
Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text
☆24 · Aug 15, 2022 · Updated 3 years ago
anosorae / IRRA
Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval (CVPR 2023)
☆285 · Mar 26, 2025 · Updated last year
Wangt-CN / Code_CASC
☆14 · Oct 14, 2019 · Updated 6 years ago
jaychempan / PriorCLIP
Official Code for “PriorCLIP: Visual Prior Guided Vision-Language Model for Remote Sensing Image-Text Retrieval”
☆30 · Dec 19, 2025 · Updated 7 months ago
woodfrog / vse_infty
Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)
☆165 · Aug 24, 2025 · Updated 10 months ago
advanc3dUA / WohnungSuchen
🏠🔍 Auto check for new apartments in Hamburg from various real estate providers
☆16 · Apr 15, 2026 · Updated 3 months ago
jicheol93 / PLOT
☆13 · Feb 13, 2025 · Updated last year
mesnico / TERAN
Code and Resources for the Transformer Encoder Reasoning and Alignment Network (TERAN), accepted for publication in ACM Transactions on M…
☆74 · Dec 6, 2023 · Updated 2 years ago
MCR-PEFT / Ex-MCR
☆44 · May 20, 2025 · Updated last year
denfed / heartheflow
Repository for the 2023 WACV paper: "Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source Localization"
☆12 · Dec 21, 2022 · Updated 3 years ago