ModalMinds/MM-PRM

MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision

☆30

Alternatives and similar repositories for MM-PRM

Users who are interested in MM-PRM are comparing it to the repositories listed below.

tliby / UniFork
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
☆48 · Aug 26, 2025 · Updated 10 months ago
FanqingM / MM-Eureka-V0
MM-Eureka V0, also called R1-Multimodal-Journey. The latest version is in MM-Eureka.
☆325 · Jun 21, 2025 · Updated last year
princetonvisualai / icons
☆22 · Apr 24, 2025 · Updated last year
UCSC-VLAA / VLAA-Thinking
[TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
☆148 · Oct 10, 2025 · Updated 9 months ago
ModalMinds / MM-EUREKA
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning
☆770 · Sep 7, 2025 · Updated 10 months ago
cmu-mind / RISE
☆34 · Oct 31, 2024 · Updated last year
yuexy / ST-AR
☆14 · Sep 22, 2025 · Updated 9 months ago
princeton-pli / what-makes-good-rm
[NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective
☆44 · Sep 18, 2025 · Updated 10 months ago
jiaangli / VILA
[TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study
☆16 · Nov 22, 2024 · Updated last year
coder-qicao / DreamPRM
DreamPRM tackles the dataset quality imbalance and distribution shift that plague multimodal PRM training by domain-reweighting.
☆24 · Sep 6, 2025 · Updated 10 months ago
IntMeGroup / LMM4LMM
[ICCV 2025 Highlight] LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs
☆20 · Nov 16, 2025 · Updated 8 months ago
HHYHRHY / MM-ACT
[CVPR'2026] "MM-ACT: Learn from Multimodal Parallel Generation to Act"
☆117 · Mar 13, 2026 · Updated 4 months ago
real-absolute-AI / NoisyRollout
[NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆112 · Sep 18, 2025 · Updated 10 months ago
InternScience / TrustGeoGen
Official repository for "TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving"
☆23 · Sep 1, 2025 · Updated 10 months ago
wizard-III / Archer2.0
Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature converg…
☆31 · Oct 10, 2025 · Updated 9 months ago
LINs-lab / IOMM
[CVPR 2026] IOMM: Fast Pre-training of Unified Multimodal Models without Text-Image Pairs
☆26 · Apr 11, 2026 · Updated 3 months ago
sail-sg / ActivePRM
☆21 · Apr 16, 2025 · Updated last year
cyzus / thoughtsculpt
THOUGHTSCULPT, a general reasoning and search method for complex tasks
☆13 · Dec 13, 2024 · Updated last year
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆94 · Aug 8, 2025 · Updated 11 months ago
si0wang / ThinkLite-VL
☆105 · Jun 10, 2025 · Updated last year
hkust-nlp / model-task-align-rl
[ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".
☆18 · Feb 9, 2026 · Updated 5 months ago
Chengsong-Huang / Self-Calibration
Code for Efficient Test-Time Scaling via Self-Calibration
☆20 · Sep 13, 2025 · Updated 10 months ago
ModalMinds / gym-v
A unified framework for vision-language environments with a Gymnasium-compatible interface
☆35 · Mar 17, 2026 · Updated 4 months ago
ludc506 / InternVL-X
☆16 · Mar 26, 2025 · Updated last year
Exgc / R1V-Free
R1V, trained with AI feedback, answers open-ended visual questions.
☆14 · Apr 12, 2025 · Updated last year
Vinoground / Vinoground
☆13 · Apr 13, 2026 · Updated 3 months ago
HHYHRHY / OWMM-Agent
[NeurIPS'2025] "OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis"
☆30 · Dec 4, 2025 · Updated 7 months ago
RLHFlow / RAFT
This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re…
☆43 · Sep 22, 2024 · Updated last year
RUCBM / AtomMem
☆27 · Mar 31, 2026 · Updated 3 months ago
SeanLeng1 / CrossWordBench
☆12 · Apr 18, 2025 · Updated last year
evolvent-ai / ClawMark
🦞 ClawMark: A Living-World Benchmark for Multi-Day, Multimodal Coworker Agents
☆116 · May 28, 2026 · Updated last month
MAmmoTH-VL / MAmmoTH-VL
(ACL 2025) MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
☆50 · Jun 4, 2025 · Updated last year
OpenRLHF / OpenRLHF-M
An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.
☆163 · Apr 6, 2026 · Updated 3 months ago
TIGER-AI-Lab / PixelWorld
The official code of "PixelWorld: Towards Perceiving Everything as Pixels" [TMLR25]
☆15 · Sep 12, 2025 · Updated 10 months ago
ls-kelvin / REVPT
Code for paper: Reinforced Vision Perception with Tools
☆74 · Oct 3, 2025 · Updated 9 months ago
ricefryegg / F1-Visa-Myths
Visa officers debunk myths about U.S. student visa applications
☆11 · May 30, 2018 · Updated 8 years ago
michelecafagna26 / vinvl-visualbackbone
Original VinVL visual backbone with simplified APIs to easily extract features, boxes, and object detections in a few lines of Python code.
☆12 · Nov 27, 2022 · Updated 3 years ago
chenllliang / G1
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
☆103 · May 20, 2025 · Updated last year
CASIA-IVA-Lab / OPT_Questioner
Official PyTorch implementation of the paper "Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner"
☆15 · Aug 9, 2023 · Updated 2 years ago