weijiawu / Awesome-RL-for-Multimodal-Foundation-Models

📖 This is a repository for organizing papers, codes and other resources related to Visual Reinforcement Learning.

☆406

Alternatives and similar repositories for Awesome-RL-for-Multimodal-Foundation-Models

Users who are interested in Awesome-RL-for-Multimodal-Foundation-Models are comparing it to the repositories listed below.

mll-lab-nu / Awesome-Spatial-Intelligence-in-VLM
A paper list for spatial reasoning
☆631 Updated 2 weeks ago
cambrian-mllm / cambrian-s
Cambrian-S: Towards Spatial Supersensing in Video
☆482 Updated last month
PzySeere / MetaSpatial
MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, …
☆203 Updated 8 months ago
vision-x-nyu / thinking-in-space
Official repo and evaluation implementation of VSI-Bench
☆668 Updated 5 months ago
UMass-Embodied-AGI / Mirage
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)
☆238 Updated 6 months ago
aim-uofa / Omni-R1
[NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
☆113 Updated 2 months ago
mit-han-lab / vila-u
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
☆417 Updated 9 months ago
TIGER-AI-Lab / Pixel-Reasoner
Pixel-Level Reasoning Model trained with RL [NeurIPS25]
☆269 Updated 2 months ago
yix8 / VisualPlanning
Visual Planning: Let's Think Only with Images
☆294 Updated 8 months ago
tulerfeng / Awesome-Embodied-Multimodal-LLMs
Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).
☆121 Updated last year
rongyaofang / GoT
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
☆305 Updated 4 months ago
tanhuajie / Reason-RFT
[NeurIPS 2025]⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.
☆267 Updated 4 months ago
diankun-wu / Spatial-MLLM
Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
☆426 Updated 2 weeks ago
Gabesarch / grounded-rl
☆116 Updated 6 months ago
OuyangKun10 / SpaceR
SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning
☆103 Updated 6 months ago
multimodal-reasoning-lab / Bagel-Zebra-CoT
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
☆117 Updated this week
EvolvingLMMs-Lab / EASI
Holistic Evaluation of Multimodal LLMs on Spatial Intelligence
☆77 Updated 2 weeks ago
Mini-o3 / Mini-o3
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
☆395 Updated last week
Purshow / Awesome-Unified-Multimodal
📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.
☆348 Updated 3 weeks ago
ML-GSAI / LLaDA-V
☆311 Updated last month
InternLM / Spatial-SSRL
Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"
☆108 Updated last month
EvolvingLMMs-Lab / EgoLife
[CVPR 2025] EgoLife: Towards Egocentric Life Assistant
☆382 Updated 10 months ago
ziqihuangg / Awesome-From-Video-Generation-to-World-Model
A list of works on video generation towards world model
☆334 Updated this week
JackYFL / awesome-VLLMs
This repository collects papers on VLLM applications. We will update new papers irregularly.
☆201 Updated last month
JIA-Lab-research / VisionReasoner
Vision Manus: Your versatile Visual AI assistant
☆317 Updated last week
SihanXU / nepa
PyTorch implementation of NEPA
☆303 Updated last week
Video-Reason / Awesome-Video-Reasoning
This is a collection of recent papers on reasoning in video generation models.
☆95 Updated 3 weeks ago
aim-uofa / Active-o3
ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
☆77 Updated 2 months ago
NVlabs / Long-RL
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
☆690 Updated 4 months ago
facebookresearch / metamorph
Code for MetaMorph: Multimodal Understanding and Generation via Instruction Tuning
☆232 Updated last week