i2vec/MM-R5

The official repository of MM-R5

☆30

Alternatives and similar repositories for MM-R5

Users who are interested in MM-R5 are comparing it to the repositories listed below.

lezhang7 / Rearank
[EMNLP 2025] Official codebase for Rearank: Reasoning Re-ranking Agent
☆40 · Aug 20, 2025 · Updated 11 months ago

DataArcTech / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆92 · Nov 15, 2024 · Updated last year

chunbolang / R2Net
Official PyTorch Implementation of Global Rectification and Decoupled Registration for Few-Shot Segmentation in Remote Sensing Imagery (T…
☆18 · Nov 22, 2023 · Updated 2 years ago

Picsart-AI-Research / Mask-Matching-Transformer
☆15 · Jan 12, 2023 · Updated 3 years ago

syr-cn / MemOCR
☆16 · Mar 9, 2026 · Updated 4 months ago

XMUDeepLIT / UME-R1
The code implementation for UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings (ICLR 2026).
☆70 · Feb 25, 2026 · Updated 5 months ago

Jord8061 / logicPoison
[ACL'26 Main Oral] Official code for "LogicPoison: Logical Attacks on Graph Retrieval-Augmented Generation".
☆35 · Jun 27, 2026 · Updated 3 weeks ago

GaryGuTC / UniME-v2
[AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"
☆74 · Dec 8, 2025 · Updated 7 months ago

XMUDeepLIT / LLaVE
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning
☆78 · May 23, 2025 · Updated last year

r2llab / GTTA
This codebase is to reproduce the results of the paper "Grounded Test-Time Adaptation for LLM Agents".
☆17 · Mar 4, 2026 · Updated 4 months ago

xianzhangzx / FINER-MLLM
The implementation of FINER-MLLM, which is accepted by MM2024.
☆18 · Oct 8, 2024 · Updated last year

haoyu-bu / CAFe
Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"
☆33 · Mar 26, 2025 · Updated last year

xiaoqian-shen / Vgent
[NeurIPS 2025 Spotlight] Official PyTorch implementation of Vgent
☆49 · Nov 30, 2025 · Updated 7 months ago

Aldrich2y / MIANet
Official PyTorch Implementation of MIANet: Aggregating Unbiased Instance and General Information for Few-Shot Semantic Segmentation (CVPR …
☆30 · Mar 15, 2024 · Updated 2 years ago

LAMDA-CL / ICCV2025-TUNA
Integrating Task-Specific and Universal Adapters for Pre-Trained Model-based Class-Incremental Learning (ICCV 2025)
☆18 · Sep 23, 2025 · Updated 10 months ago

HanboBizl / DMNet
DMNet for Few-shot Segmentation
☆33 · Nov 10, 2023 · Updated 2 years ago

vec-ai / wikiHow-TIIR
[ACL 2025] Towards Text-Image Interleaved Retrieval
☆16 · Sep 3, 2025 · Updated 10 months ago

icq-benchmark / icq-benchmark
☆19 · Jul 28, 2025 · Updated 11 months ago

Chenwei-Huang / DPH-Net
Demo code of the paper "Deep Image Registration With Depth-Aware Homography Estimation"
☆10 · Feb 18, 2023 · Updated 3 years ago

TrinitialChan / DifFSS
Official Implementation of the paper "DifFSS: Diffusion Model for Few-Shot Semantic Segmentation"
☆14 · Jul 26, 2023 · Updated 3 years ago

BatsResearch / efsl
Extended Few-Shot Learning: Exploiting Existing Resources for Novel Tasks
☆10 · Jul 6, 2021 · Updated 5 years ago

Angknpng / UniSOD
Unified-modal Salient Object Detection via Adaptive Prompt Learning
☆12 · Jul 17, 2026 · Updated last week

MrZilinXiao / AutoVER
[ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.
☆14 · Mar 2, 2024 · Updated 2 years ago

cqu-student / Wiki-PRF
☆19 · Mar 9, 2026 · Updated 4 months ago

LoieSun / Auto-ACD
Code for A Large-scale Dataset for Audio-Language Representation Learning
☆14 · Sep 18, 2024 · Updated last year

riedlerm / multimodal_rag_for_industry
Implementation and evaluation of multimodal RAG with text and image inputs for industrial applications
☆72 · Nov 6, 2024 · Updated last year

weijingxuan / COCO-MMR
☆11 · Sep 27, 2023 · Updated 2 years ago

bodymovin / lottie-format-converter
lottie script to convert json files to the new lottie format
☆16 · Feb 20, 2019 · Updated 7 years ago

yansheng-qiu / AI_Idea_Bench_2025
☆16 · May 15, 2025 · Updated last year

deepglint / RealSyn
[ACM MM2025] The official repository for the RealSyn dataset
☆39 · Dec 14, 2025 · Updated 7 months ago

Multimodal-Representation-Learning-MRL / GA-DMS
[EMNLP25 Main] The official code of "Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval"
☆25 · Mar 30, 2026 · Updated 3 months ago

BriansIDP / AudioVisualLLM
☆19 · May 19, 2024 · Updated 2 years ago

JasonForJoy / BRIEF
ACL 2026 & NAACL 2025: Bridging Retrieval and Inference through Evidence Fusion
☆14 · Apr 9, 2026 · Updated 3 months ago

pengts / VW-LMM
☆25 · May 13, 2024 · Updated 2 years ago

RUCBM / GUICourse
GUICourse: From General Vision Language Models to Versatile GUI Agents
☆143 · Mar 1, 2026 · Updated 4 months ago

Code-kunkun / LamRA
[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
☆182 · Jul 7, 2025 · Updated last year

dhg-wei / MCL
(ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning
☆28 · Sep 27, 2024 · Updated last year

Caoyichao / UniHOI
Code for the paper "Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundatio…
☆28 · Nov 8, 2023 · Updated 2 years ago

TIGER-AI-Lab / VLM2Vec
This repo contains the code for "VLM2Vec / MMEB" [ICLR 2025], "VLM2Vec-V2 / MMEB-V2" [TMLR 2026], and "MMEB-V3" [COLM 2026]
☆669 · Updated this week