LengSicong/MMR1

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/LengSicong/MMR1)

LengSicong / MMR1

[CVPR 2026] MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

☆217

Alternatives and similar repositories for MMR1

Users that are interested in MMR1 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

DAMO-NLP-SG / multimodal_textbook
View on GitHub
[ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
☆196Mar 17, 2025Updated last year
FanqingM / MM-Eureka-V0
View on GitHub
MM-Eureka V0 also called R1-Multimodal-Journey, Latest version is in MM-Eureka
☆325Jun 21, 2025Updated last year
yuyq96 / R1-Vision
View on GitHub
R1-Vision: Let's first take a look at the image
☆48Feb 16, 2025Updated last year
real-absolute-AI / NoisyRollout
View on GitHub
[NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆112Sep 18, 2025Updated 10 months ago
ModalMinds / MM-EUREKA
View on GitHub
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning
☆770Sep 7, 2025Updated 10 months ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
Wang-Xiaodong1899 / Open-R1-Video
View on GitHub
✨First Open-Source R1-like Video-LLM [2025/02/18]
☆382Jul 1, 2026Updated 3 weeks ago
TideDra / lmm-r1
View on GitHub
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
☆848May 14, 2025Updated last year
alibaba-damo-academy / VL-Cogito
View on GitHub
☆24Nov 4, 2025Updated 8 months ago
appletea233 / Temporal-R1
View on GitHub
Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency
☆62Jun 6, 2025Updated last year
OpenRLHF / OpenRLHF-M
View on GitHub
An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.
☆163Apr 6, 2026Updated 3 months ago
DAMO-NLP-SG / CMM
View on GitHub
✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
☆54Jul 11, 2025Updated last year
turningpoint-ai / VisualThinker-R1-Zero
View on GitHub
Explore the Multimodal “Aha Moment” on 2B Model
☆624Mar 18, 2025Updated last year
kxfan2002 / SophiaVL-R1
View on GitHub
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆94Aug 8, 2025Updated 11 months ago
Osilly / Vision-R1
View on GitHub
[ICLR2026] This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that…
☆1,584Mar 20, 2026Updated 4 months ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
Fancy-MLLM / R1-Onevision
View on GitHub
R1-onevision, a visual language model capable of deep CoT reasoning.
☆581Apr 13, 2025Updated last year
EvolvingLMMs-Lab / open-r1-multimodal
View on GitHub
A fork to add multimodal model training to open-r1
☆1,593Feb 8, 2025Updated last year
hiyouga / EasyR1
View on GitHub
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,081Updated this week
LCO-Embedding / LCO-Embedding
View on GitHub
[NeurIPS 2025] Scaling Language-centric Omnimodal Representation Learning
☆48Apr 13, 2026Updated 3 months ago
dongyh20 / Insight-V
View on GitHub
[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
☆240Nov 7, 2025Updated 8 months ago
LaVi-Lab / Visual-Table
View on GitHub
[EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"
☆20Oct 17, 2024Updated last year
alibaba-damo-academy / EOCBench
View on GitHub
[NeurIPS 2025] EOC-Bench, an innovative benchmark designed to systematically evaluate object-centric embodied cognition in dynamic egocen…
☆22Jun 17, 2025Updated last year
CircleRadon / TokenPacker
View on GitHub
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV2025
☆279May 26, 2025Updated last year
NVlabs / Long-RL
View on GitHub
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
☆727Sep 24, 2025Updated 10 months ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs
View on GitHub
This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas…
☆1,435May 11, 2026Updated 2 months ago
jxjessieli / contextual-distortion-parser
View on GitHub
[ACL 2023] Contextual Distortion Reveals Constituency: Mask Language Models are Implicit Parsers.
☆14Jun 3, 2023Updated 3 years ago
DAMO-NLP-SG / MT-LLaMA
View on GitHub
Multi-Task instruction-tuned LLaMA
☆14May 5, 2023Updated 3 years ago
TIGER-AI-Lab / VL-Rethinker
View on GitHub
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆190Jun 5, 2025Updated last year
DAMO-NLP-SG / VideoLLaMA3
View on GitHub
Frontier Multimodal Foundation Models for Image and Video Understanding
☆1,172Aug 14, 2025Updated 11 months ago
DAMO-NLP-SG / SSTuning
View on GitHub
Code for ACL paper "Zero-Shot Text Classification via Self-Supervised Tuning"
☆29Sep 25, 2023Updated 2 years ago
ltzheng / SimpleTIR
View on GitHub
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆401Mar 30, 2026Updated 3 months ago
InternLM / OREAL
View on GitHub
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
☆190Mar 20, 2025Updated last year
MiroMindAI / MiroMind-M1
View on GitHub
MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.
☆279Aug 12, 2025Updated 11 months ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
DAMO-NLP-SG / VideoLLaMA2
View on GitHub
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
☆1,304Jan 23, 2025Updated last year
alibaba-damo-academy / MedEvalKit
View on GitHub
MedEvalKit: A Unified Medical Evaluation Framework
☆247Feb 24, 2026Updated 5 months ago
StarsfieldAI / R1-V
View on GitHub
Witness the aha moment of VLM with less than $3.
☆4,065May 19, 2025Updated last year
Visual-Agent / DeepEyes
View on GitHub
☆1,251Nov 20, 2025Updated 8 months ago
Liuziyu77 / Visual-RFT
View on GitHub
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'’
☆2,263Oct 29, 2025Updated 8 months ago
DAMO-NLP-SG / VCD
View on GitHub
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
☆410Oct 7, 2024Updated last year
Wild-Cooperation-Hub / Awesome-MLLM-Reasoning-Benchmarks
View on GitHub
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
☆76Mar 18, 2025Updated last year