yfzhang114/r1_reward

✨✨ [ICLR 2026] R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

☆291

Alternatives and similar repositories for r1_reward

Users who are interested in r1_reward are comparing it to the repositories listed below.

Kwai-YuanQi / MM-RLHF
The Next Step Forward in Multimodal LLM Alignment
☆199 · May 1, 2025 · Updated last year
yfzhang114 / Thyme
✨✨ [ICLR 2026] Think Beyond Images
☆584 · Sep 23, 2025 · Updated 10 months ago
MME-Benchmarks / MME-Unify
✨✨ [ICLR 2026] MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
☆43 · Apr 10, 2025 · Updated last year
VITA-MLLM / Sparrow
Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation
☆32 · Mar 28, 2025 · Updated last year
Northern-byte-bit / SpeechParaling-Bench
☆30 · May 21, 2026 · Updated 2 months ago
RM-R1-UIUC / RM-R1
[ICLR'26] RM-R1: Unleashing the Reasoning Potential of Reward Models
☆167 · Jun 26, 2025 · Updated last year
PKU-Alignment / align-anything
Align Anything: Training All-modality Model with Feedback
☆4,664 · Nov 27, 2025 · Updated 8 months ago
VITA-MLLM / Long-VITA
✨✨ Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
☆305 · May 14, 2025 · Updated last year
CodeGoat24 / UnifiedReward
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
☆796 · Jun 18, 2026 · Updated last month
facebookresearch / multimodal_rewardbench
Multimodal RewardBench
☆68 · Feb 21, 2025 · Updated last year
SkyworkAI / Skywork-R1V
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.
☆3,159 · Dec 15, 2025 · Updated 7 months ago
yfzhang114 / SliME
✨✨ Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models
☆163 · Dec 26, 2024 · Updated last year
MME-Benchmarks / MME-RealWorld
✨✨ [ICLR 2025] MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
☆161 · Oct 21, 2025 · Updated 9 months ago
MAC-AutoML / QuoTA
✨✨ [AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Vi…
☆79 · Apr 28, 2025 · Updated last year
Hunyuan-PromptEnhancer / PromptEnhancer
[CVPR 2026] PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image generation.
☆3,739 · Jun 10, 2026 · Updated last month
real-absolute-AI / NoisyRollout
[NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆112 · Sep 18, 2025 · Updated 10 months ago
Kwai-YuanQi / TaskGalaxy
Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
☆32 · Jul 16, 2025 · Updated last year
hyperai / tvm-cn
TVM Documentation in Chinese Simplified / TVM 中文文档
☆3,868 · May 20, 2026 · Updated 2 months ago
limix-ldm-ai / LimiX
LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence https://arxiv.org/abs/2509.03505
☆3,852 · Jun 16, 2026 · Updated last month
TideDra / lmm-r1
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
☆848 · May 14, 2025 · Updated last year
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆190 · Jun 5, 2025 · Updated last year
yangruoliu / VideoDetective
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
☆58 · May 1, 2026 · Updated 2 months ago
MiG-NJU / PersonaVLM
[CVPR 2026 Highlight] PersonaVLM: Long-Term Personalized Multimodal LLMs
☆112 · Apr 16, 2026 · Updated 3 months ago
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆94 · Aug 8, 2025 · Updated 11 months ago
Kwai-Keye / Keye
☆808 · Jun 10, 2026 · Updated last month
minglllli / CLS-RL
[NeurIPS 2025 Spotlight] Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning
☆90 · Sep 19, 2025 · Updated 10 months ago
OpenDCAI / DataFlow
Easy Data Preparation with latest LLMs-based Operators and Pipelines.
☆7,070 · Jul 15, 2026 · Updated 2 weeks ago
vl-rewardbench / VL_RewardBench
☆29 · Jul 23, 2025 · Updated last year
yunfeixie233 / ViGaL
☆70 · Feb 4, 2026 · Updated 5 months ago
open-gigaai / giga-brain-0
GigaBrain-0: A World Model-Powered Vision-Language-Action Model
☆2,554 · Mar 10, 2026 · Updated 4 months ago
Kwai-Klear / AR-GRPO
Training Autoregressive Image Generation models via Reinforcement Learning
☆53 · Nov 26, 2025 · Updated 8 months ago
Klavis-AI / klavis
Klavis AI: MCP integration platforms that let AI agents use tools reliably at any scale
☆5,779 · Jun 1, 2026 · Updated last month
Kail-Fu / InterviewOS
Replace coding puzzles with real-work simulations.
☆1,906 · Jul 10, 2026 · Updated 2 weeks ago
ZJU4HealthCare / Foundations-of-Medical-LLMs
Foundations of Medical Large Language Model Learning
☆1,756 · May 27, 2026 · Updated 2 months ago
YOOTeam / OpenPPT
AIPPT online editor, based on ChatPPT, supports document editing services throughout the entire process, including import, export, layout b…
☆1,089 · Sep 17, 2025 · Updated 10 months ago
VITA-MLLM / Omni-Diffusion
✨✨ [ICML 2026] Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion
☆153 · Mar 12, 2026 · Updated 4 months ago
Lucas0623z / NoteLite
☆856 · Jul 9, 2026 · Updated 2 weeks ago
Gen-Verse / HermesFlow
[NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
☆77 · Sep 19, 2025 · Updated 10 months ago
HJYao00 / Mulberry
[NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS
☆1,244 · Jan 16, 2026 · Updated 6 months ago