XMUDeepLIT/UME-R1

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/XMUDeepLIT/UME-R1)

XMUDeepLIT / UME-R1

The code implementation for UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings (ICLR 2026).

☆39Updated this week

Alternatives and similar repositories for UME-R1

Users that are interested in UME-R1 are comparing it to the libraries listed below

Sorting:

XMUDeepLIT / LLaVE
View on GitHub
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning
☆77May 23, 2025Updated 9 months ago
joslefaure / HERMES
View on GitHub
[ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics
☆38Sep 10, 2025Updated 5 months ago
marinero4972 / CyberV
View on GitHub
☆18Jun 10, 2025Updated 8 months ago
iihcy / Credit_ACard
View on GitHub
信贷申请评分
☆10Apr 1, 2020Updated 5 years ago
NathanielFelleke / TinyML-Fall-Detection
View on GitHub
☆11Jul 2, 2021Updated 4 years ago
nguyenthaibinh / stam
View on GitHub
☆12Sep 25, 2023Updated 2 years ago
deepglint / UniME
View on GitHub
[ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"
☆103Dec 8, 2025Updated 2 months ago
Lilidamowang / T2VIndexer-generativeSearch
View on GitHub
☆13Aug 28, 2024Updated last year
hadaev8 / physionet_2017_rcrnn
View on GitHub
Solving physionet2017 with RCRNN
☆10Jun 11, 2019Updated 6 years ago
jirispilka / CTGViewer
View on GitHub
The CTGViewer: display cardiotocography (CTG) records -- fetal heart rate and uterine contractions.
☆10Feb 22, 2019Updated 7 years ago
TengShi-RUC / CFRAG
View on GitHub
☆16Oct 9, 2024Updated last year
flugel-it / k8s-python-operator
View on GitHub
Kubernetes operator example in Python3
☆13Mar 21, 2019Updated 6 years ago
lishuhuakai / quagga_reading
View on GitHub
quagga
☆10Apr 7, 2020Updated 5 years ago
hyc2026 / M3-Agent-Training
View on GitHub
☆28Jan 5, 2026Updated last month
yimeng-fan / SpikSSD
View on GitHub
☆16Jul 17, 2025Updated 7 months ago
mlvlab / BLiM
View on GitHub
Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight)
☆20Aug 1, 2025Updated 7 months ago
caskcsg / longcontext
View on GitHub
Long Context Research
☆26Jan 26, 2026Updated last month
KejiaZhang-Robust / TARS
View on GitHub
TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs
☆23Sep 21, 2025Updated 5 months ago
fansunqi / AKeyS
View on GitHub
Agentic Keyframe Search for Video Question Answering
☆16Apr 7, 2025Updated 10 months ago
wangyushuai / k8s-deploy-yaml
View on GitHub
常用开源软件（Jaeger,grafana,consul,prometheus,nginx-ingress-controller）及常用资源（deployment,svc,ingress...） K8s部署Yaml合集
☆12Jun 27, 2020Updated 5 years ago
Burf / SwinTransformer-Tensorflow2
View on GitHub
SwinTransformer for Tensorflow2
☆12Jul 7, 2022Updated 3 years ago
jmnian / WRAG
View on GitHub
Code for paper "W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering"
☆15Oct 2, 2025Updated 4 months ago
WangFei-2019 / SNARE
View on GitHub
Project for SNARE benchmark
☆11Jun 5, 2024Updated last year
BoguslawObara / vesselness2d
View on GitHub
2d multiscale vessel enhancement filtering
☆10May 14, 2023Updated 2 years ago
qirui-chen / RGA3-release
View on GitHub
[ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring
☆24Aug 8, 2025Updated 6 months ago
zjucsq / PLA
View on GitHub
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12Sep 17, 2023Updated 2 years ago
dhk1349 / MERLIN_text_to_video_search
View on GitHub
[EMNLP 2024 Industry track] MERLIN : Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank P…
☆14Mar 4, 2025Updated 11 months ago
CnFaker / LLaVA-SP
View on GitHub
[ICCV 2025] The official pytorch implement of "LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs".
☆22Oct 28, 2025Updated 4 months ago
ChangyaoTian / ADDP
View on GitHub
The official implementation of ADDP (ICLR 2024)
☆12Mar 27, 2024Updated last year
hysts / CogView2_demo
View on GitHub
Unofficial demo app for CogView2
☆15Jul 30, 2022Updated 3 years ago
traveler-framework / TraveLER
View on GitHub
[EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering
☆16Oct 31, 2024Updated last year
Jayce1kk / SpaceVLLM
View on GitHub
SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability
☆16May 8, 2025Updated 9 months ago
deepglint / DanQing
View on GitHub
The official repo for the DanQing dataset.
☆29Jan 16, 2026Updated last month
levi-katarok / simplified-rag
View on GitHub
Simplifying RAG with PostgreSQL and PGVector
☆16Jul 31, 2024Updated last year
intelligolabs / CoIN
View on GitHub
[ICCV 25] Official repository of "Collaborative Instance Object Navigation: Leveraging Uncertainty-Awareness to Minimize Human-Agent Dial…
☆25Dec 6, 2025Updated 2 months ago
lmotte / graph-prediction-with-fused-gromov-wasserstein
View on GitHub
Python implementation of the supervised graph prediction method proposed in http://arxiv.org/abs/2202.03813 using PyTorch library and POT…
☆15Feb 25, 2022Updated 4 years ago
karin0 / jumpserver-proxy
View on GitHub
在本地愉快写 BUAA OS Lab，并直接在本地使用 git 提交。
☆10Jun 2, 2021Updated 4 years ago
engindeniz / vitis
View on GitHub
[ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
☆14Jan 13, 2025Updated last year
Ming-er / LGC-SED
View on GitHub
☆13Jan 3, 2024Updated 2 years ago