GaryGuTC/UniME-v2

[AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"

☆74

Alternatives and similar repositories for UniME-v2

Users that are interested in UniME-v2 are comparing it to the libraries listed below.

deepglint / UniME
[ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"
☆105 · Dec 8, 2025 · Updated 7 months ago
deepglint / RealSyn
[ACM MM2025] The official repository for the RealSyn dataset
☆39 · Dec 14, 2025 · Updated 7 months ago
XMUDeepLIT / UME-R1
The code implementation for UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings (ICLR 2026).
☆70 · Feb 25, 2026 · Updated 5 months ago
anxiangsir / V-SWIFT
V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day
☆30 · Feb 5, 2025 · Updated last year
xiaoxing2001 / DeGLA
[ACM MM25] Official Pytorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding]
☆16 · Jul 15, 2025 · Updated last year
haoyu-bu / CAFe
Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"
☆33 · Mar 26, 2025 · Updated last year
ZoengHN / Embed-RL
☆45 · Jun 23, 2026 · Updated last month
deepglint / Victor
ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs
☆29 · Aug 15, 2025 · Updated 11 months ago
EvolvingLMMs-Lab / LLaVA-OneVision-1.5-RL
Fully Open Framework for Democratized Multimodal Reinforcement Learning.
☆51 · Dec 19, 2025 · Updated 7 months ago
deepglint / MLCD-Seg
MLCD-Seg is a zero-shot segmentation model from DeepGlint.
☆18 · Jul 4, 2025 · Updated last year
Roytsai27 / GIRCSE
Official implementation of ICLR 2026: Let LLMs Speak Embedding Languages: Generative Text Embeddings via Iterative Contrastive Refinement
☆15 · May 24, 2026 · Updated 2 months ago
deepglint / DanQing
The official repo for the DanQing dataset.
☆36 · Mar 25, 2026 · Updated 4 months ago
WeChatCV / ObjEmbed
(ICML 2026) Official repository of paper "ObjEmbed: Towards Universal Multimodal Object Embeddings"
☆51 · May 18, 2026 · Updated 2 months ago
GasolSun36 / GRACE
[ICLR 2025] Official repo for paper: "GRACE: Generative Representation Learning via Contrastive Policy Optimization"
☆39 · Feb 3, 2026 · Updated 5 months ago
TIGER-AI-Lab / VLM2Vec
This repo contains the code for "VLM2Vec / MMEB" [ICLR 2025], "VLM2Vec-V2 / MMEB-V2" [TMLR 2026], and "MMEB-V3" [COLM 2026]
☆669 · Updated this week
Multimodal-Representation-Learning-MRL / GA-DMS
[EMNLP25 Main] The official code of "Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval"
☆25 · Mar 30, 2026 · Updated 3 months ago
raghavlite / B3
☆43 · Jan 12, 2026 · Updated 6 months ago
Z1zs / MMNeuron
Official implementation of "MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model". Our co…
☆26 · Dec 20, 2024 · Updated last year
haoxiangzhao12138 / PLUME
[ACMMM 2026] PLUME: Latent Reasoning Based Universal Multimodal Embedding
☆24 · Apr 29, 2026 · Updated 2 months ago
maifoundations / Visionary-R1
Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning
☆44 · Jul 2, 2025 · Updated last year
360CVGroup / RzenEmbed
Embedding model prioritized towards Multimodal RAG, overall + VisDoc double top1 on MMEB benchmark
☆36 · Jun 16, 2026 · Updated last month
XMUDeepLIT / LLaVE
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning
☆78 · May 23, 2025 · Updated last year
friedrichor / UNITE
Official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval"
☆42 · Jul 4, 2025 · Updated last year
Tencent-QQMM / QQMM-embed
☆25 · Jun 22, 2026 · Updated last month
haon-chen / mmE5
☆59 · Feb 27, 2025 · Updated last year
MCG-NJU / VideoEval
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model
☆15 · Jul 31, 2025 · Updated 11 months ago
haon-chen / MoCa
☆68 · Aug 14, 2025 · Updated 11 months ago
deepglint / UniDoc-RL
UniDoc-RL: Unified Document Understanding with Reinforcement Learning
☆16 · May 21, 2026 · Updated 2 months ago
MCG-NJU / RGE
Reasoning Guided Embeddings: Leveraging MLLM Reasoning for Improved Multimodal Retrieval
☆15 · Nov 29, 2025 · Updated 7 months ago
chendy25 / V-Retrver
☆36 · May 27, 2026 · Updated last month
EvolvingLMMs-Lab / LLaVA-OneVision-2
Fully Open Framework for Democratized Multimodal Training
☆1,150 · Updated this week
kongds / E5-V
E5-V: Universal Embeddings with Multimodal Large Language Models
☆275 · Dec 10, 2025 · Updated 7 months ago
OpenGVLab / LCL
[NeurIPS 2024] Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
☆72 · Feb 11, 2025 · Updated last year
EvolvingLMMs-Lab / OpenMMReasoner
[CVPR 2026] OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
☆164 · Mar 30, 2026 · Updated 3 months ago
deepglint / RWKV-CLIP
[EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner
☆151 · Dec 14, 2025 · Updated 7 months ago
LunarShen / DsicoVLA
[CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval
☆22 · Jun 23, 2025 · Updated last year
dhk1349 / MERLIN_text_to_video_search
[EMNLP 2024 Industry track] MERLIN: Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank P…
☆14 · Mar 4, 2025 · Updated last year
Luodian / nano-hevc
A minimal, educational HEVC (H.265) encoder written in Python.
☆53 · Feb 23, 2026 · Updated 5 months ago
i2vec / MM-R5
The official repository of MM-R5
☆30 · Jun 22, 2025 · Updated last year