✨✨ [ICLR 2026] R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
☆285 · May 9, 2025 · Updated 11 months ago
Alternatives and similar repositories for r1_reward
Users who are interested in r1_reward are comparing it to the libraries listed below.
- The Next Step Forward in Multimodal LLM Alignment ☆201 · May 1, 2025 · Updated 11 months ago
- Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation ☆32 · Mar 28, 2025 · Updated last year
- ✨✨ [ICLR 2026] Think Beyond Images ☆578 · Sep 23, 2025 · Updated 6 months ago
- ✨✨ [ICLR 2026] MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models ☆43 · Apr 10, 2025 · Updated last year
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ☆32 · Jul 16, 2025 · Updated 9 months ago
- ✨✨ [AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Vi… ☆78 · Apr 28, 2025 · Updated 11 months ago
- ✨✨ [ICLR 2025] MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? ☆155 · Oct 21, 2025 · Updated 5 months ago
- ✨✨ Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy ☆305 · May 14, 2025 · Updated 11 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆112 · Sep 18, 2025 · Updated 7 months ago
- ✨✨ Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models ☆164 · Dec 26, 2024 · Updated last year
- ☆47 · Apr 9, 2025 · Updated last year
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆161 · Jun 26, 2025 · Updated 9 months ago
- Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning. ☆3,162 · Dec 15, 2025 · Updated 4 months ago
- [NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation ☆78 · Sep 19, 2025 · Updated 7 months ago
- Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex ☆765 · Mar 19, 2026 · Updated last month
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks. ☆842 · May 14, 2025 · Updated 11 months ago
- Multimodal RewardBench ☆68 · Feb 21, 2025 · Updated last year
- VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding ☆59 · Mar 24, 2026 · Updated 3 weeks ago
- ✨✨ [NeurIPS 2025] VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model ☆678 · May 24, 2025 · Updated 10 months ago
- Align Anything: Training All-modality Model with Feedback ☆4,646 · Nov 27, 2025 · Updated 4 months ago
- ✨✨ [NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehensi… ☆415 · Jan 14, 2026 · Updated 3 months ago
- [NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT ☆431 · Sep 18, 2025 · Updated 7 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆437 · Mar 20, 2026 · Updated last month
- ☆814 · Jun 9, 2025 · Updated 10 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆82 · Dec 25, 2025 · Updated 3 months ago
- Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning' ☆2,261 · Oct 29, 2025 · Updated 5 months ago
- Explore the Multimodal “Aha Moment” on 2B Model ☆623 · Mar 18, 2025 · Updated last year
- [NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS ☆1,243 · Jan 16, 2026 · Updated 3 months ago
- ☆68 · Feb 4, 2026 · Updated 2 months ago
- Solve Visual Understanding with Reinforced VLMs ☆5,939 · Mar 12, 2026 · Updated last month
- A fork to add multimodal model training to open-r1 ☆1,528 · Feb 8, 2025 · Updated last year
- MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning ☆772 · Sep 7, 2025 · Updated 7 months ago
- [NeurIPS 2025] TTRL: Test-Time Reinforcement Learning ☆1,048 · Updated this week
- Repo for paper https://arxiv.org/abs/2504.13837 ☆336 · Dec 17, 2025 · Updated 4 months ago
- Witness the aha moment of VLM with less than $3. ☆4,050 · May 19, 2025 · Updated 11 months ago
- [NeurIPS 2025 Spotlight] Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning ☆84 · Sep 19, 2025 · Updated 7 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ☆93 · Aug 8, 2025 · Updated 8 months ago
- Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities ☆1,175 · Jul 15, 2025 · Updated 9 months ago
- ☆271 · May 14, 2025 · Updated 11 months ago