friedrichor/UNITE

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/friedrichor/UNITE)

friedrichor / UNITE

official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval"

☆42

Alternatives and similar repositories for UNITE

Users that are interested in UNITE are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

XMUDeepLIT / LLaVE
View on GitHub
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning
☆78May 23, 2025Updated last year
Code-kunkun / LamRA
View on GitHub
[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
☆182Jul 7, 2025Updated last year
TIGER-AI-Lab / VLM2Vec
View on GitHub
This repo contains the code for "VLM2Vec / MMEB" [ICLR 2025], "VLM2Vec-V2 / MMEB-V2" [TMLR 2026], and "MMEB-V3" [COLM 2026]
☆669Updated this week
LunarShen / DsicoVLA
View on GitHub
[CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval
☆22Jun 23, 2025Updated last year
kongds / E5-V
View on GitHub
E5-V: Universal Embeddings with Multimodal Large Language Models
☆275Dec 10, 2025Updated 7 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
raghavlite / B3
View on GitHub
☆43Jan 12, 2026Updated 6 months ago
haoyu-bu / CAFe
View on GitHub
Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"
☆33Mar 26, 2025Updated last year
marinero4972 / CyberV
View on GitHub
☆20Jun 10, 2025Updated last year
GaryGuTC / UniME-v2
View on GitHub
[AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"
☆74Dec 8, 2025Updated 7 months ago
leolee99 / CLIP_ITM
View on GitHub
A simple pytorch implementation of baseline based-on CLIP for Image-text Matching.
☆19May 25, 2023Updated 3 years ago
Ranking-VMR / SPR
View on GitHub
☆13Jun 11, 2026Updated last month
VectorSpaceLab / MegaPairs
View on GitHub
[ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval
☆248Nov 6, 2025Updated 8 months ago
YuxiXie / V-DPO
View on GitHub
Preference Learning for LLaVA
☆60Nov 9, 2024Updated last year
WangFei-2019 / SNARE
View on GitHub
Project for SNARE benchmark
☆11Jun 5, 2024Updated 2 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
LuminosityX / FNE
View on GitHub
Implementation of our paper, Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination..
☆20Dec 3, 2023Updated 2 years ago
MCG-NJU / RGE
View on GitHub
Reasoning Guided Embeddings: Leveraging MLLM Reasoning for Improved Multimodal Retrieval
☆15Nov 29, 2025Updated 7 months ago
lscpku / VITATECS
View on GitHub
☆18Jul 10, 2024Updated 2 years ago
vec-ai / wikiHow-TIIR
View on GitHub
[ACL 2025] Towards Text-Image Interleaved Retrieval
☆16Sep 3, 2025Updated 10 months ago
mlvlab / BLiM
View on GitHub
Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight)
☆26Aug 1, 2025Updated 11 months ago
iLearn-Lab / ACM-MM25-PUMA
View on GitHub
[ACM MM 2025] PUMA: Layer-Pruned Language Model for Efficient Unified Multimodal Retrieval with Modality-Adaptive Learning
☆18Jun 6, 2026Updated last month
haon-chen / MoCa
View on GitHub
☆68Aug 14, 2025Updated 11 months ago
MCG-NJU / VideoEval
View on GitHub
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model
☆15Jul 31, 2025Updated 11 months ago
CrossmodalGroup / LAPS
View on GitHub
Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment, CVPR, 2024
☆110Jun 26, 2025Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
XMUDeepLIT / UME-R1
View on GitHub
The code implementation for UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings (ICLR 2026).
☆70Feb 25, 2026Updated 5 months ago
microsoft / A-CLIP
View on GitHub
Official Implementation of Attentive Mask CLIP (ICCV2023, https://arxiv.org/abs/2212.08653)
☆37May 29, 2024Updated 2 years ago
TechNomad-ds / LoVR-benchmark
View on GitHub
[WWW'2026 Oral] LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts
☆16May 19, 2026Updated 2 months ago
OpenMatch / MARVEL
View on GitHub
[ACL 2024 Oral] This is the code repo for our ACL‘24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Mo…
☆39Jun 30, 2024Updated 2 years ago
haon-chen / mmE5
View on GitHub
☆59Feb 27, 2025Updated last year
BIGBALLON / BeyondCLIP
View on GitHub
Not a neutral survey — a field manual for engineers who build, train, and ship multimodal retrieval at production scale. The C-L-I triang…
☆81Apr 20, 2026Updated 3 months ago
cwj1412 / MSCOCO-Flikcr30K_FG
View on GitHub
Benchmark data for "Rethinking Benchmarks for Cross-modal Image-text Retrieval" (SIGIR 2023)
☆28Apr 24, 2023Updated 3 years ago
RitaRamo / extra
View on GitHub
Retrieval-augmented Image Captioning
☆13Feb 16, 2023Updated 3 years ago
megvii-research / protoclip
View on GitHub
📍 Official repository of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS 2023)
☆56Nov 8, 2023Updated 2 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
taewhankim / VIPCAP
View on GitHub
☆15Dec 31, 2024Updated last year
LisaAnne / TemporalLanguageRelease
View on GitHub
☆44Mar 8, 2021Updated 5 years ago
lucasjinreal / wnnx_models
View on GitHub
Various test models in WNNX format. It can view with `pip install wnetron && wnetron`
☆12Jun 22, 2022Updated 4 years ago
MCG-NJU / CaReBench
View on GitHub
A Fine-grained Benchmark for Video Captioning and Retrieval
☆30Jul 16, 2025Updated last year
youonly-once / HNH
View on GitHub
High-order nonlocal Hashing for unsupervised cross-modal retrieval
☆14Nov 11, 2023Updated 2 years ago
Jam1ezhang / RankCLIP
View on GitHub
Ranking-Consistent Language-Image Pretraining
☆15Oct 24, 2025Updated 9 months ago
StanfordMIMI / villa
View on GitHub
[ICCV 2023] ViLLA: Fine-grained vision-language representation learning from real-world data
☆45Oct 15, 2023Updated 2 years ago