[NeurIPS 2025 Spotlight] Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning
☆80Sep 19, 2025Updated 5 months ago
Alternatives and similar repositories for CLS-RL
Users that are interested in CLS-RL are comparing it to the libraries listed below
- [EMNLP 2024] Implementation of vision-language model fine-tuning via simple parameter-efficient modification☆18Nov 24, 2024Updated last year
- [Blog 1] Recording a bug of grpo_trainer in some R1 projects☆22Feb 23, 2025Updated last year
- [ICLR2026] This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that…☆773Jan 26, 2026Updated last month
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation☆39Jul 22, 2025Updated 7 months ago
- ☆20Oct 10, 2025Updated 4 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation☆105Sep 18, 2025Updated 5 months ago
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.☆841May 14, 2025Updated 9 months ago
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"☆32Mar 26, 2025Updated 11 months ago
- Encourage Medical LLM to engage in deep thinking similar to DeepSeek-R1.☆26Apr 24, 2025Updated 10 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]☆182Jun 5, 2025Updated 8 months ago
- Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency☆60Jun 6, 2025Updated 8 months ago
- Explore the Multimodal “Aha Moment” on 2B Model☆623Mar 18, 2025Updated 11 months ago
- code for EMNLP2018 paper 'Associative-multichannel-autoencoder for multimodal word representation'☆13Aug 24, 2018Updated 7 years ago
- [ICLR 2025 Spotlight] Realistic Evaluation of Deep Partial-Label Learning Algorithms☆14Feb 2, 2025Updated last year
- The official implementation of "Grounded Chain-of-Thought for Multimodal Large Language Models"☆21Jul 21, 2025Updated 7 months ago
- [NeurIPS 2025] Official code implementation of Perception R1: Pioneering Perception Policy with Reinforcement Learning☆286Jul 15, 2025Updated 7 months ago
- ☆24Jun 18, 2025Updated 8 months ago
- ☆18Apr 20, 2025Updated 10 months ago
- ☆107Jun 10, 2025Updated 8 months ago
- TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning☆114Dec 24, 2025Updated 2 months ago
- ☆14Sep 14, 2023Updated 2 years ago
- ☆27Updated this week
- Codebase for VidHal: Benchmarking Hallucinations in Vision LLMs☆14Apr 19, 2025Updated 10 months ago
- ✨✨ [ICLR 2026] R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning☆281May 9, 2025Updated 9 months ago
- MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning☆770Sep 7, 2025Updated 5 months ago
- Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models☆108Jul 7, 2025Updated 7 months ago
- ☆17Feb 22, 2024Updated 2 years ago
- [NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO☆79Oct 29, 2025Updated 4 months ago
- R1-onevision, a visual language model capable of deep CoT reasoning.☆576Apr 13, 2025Updated 10 months ago
- The official code for "TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning" | [AAAI2025]☆49Mar 13, 2025Updated 11 months ago
- VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models☆77Jul 13, 2024Updated last year
- [CVPR' 25] Interleaved-Modal Chain-of-Thought☆106Dec 30, 2025Updated 2 months ago
- Controllable image captioning model with unsupervised modes☆21Apr 14, 2023Updated 2 years ago
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction☆141Mar 6, 2025Updated 11 months ago
- [TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models☆149Oct 10, 2025Updated 4 months ago
- [LLaVA-Video-R1]✨First Adaptation of R1 to LLaVA-Video (2025-03-18)☆68May 9, 2025Updated 9 months ago
- [NeurIPS25] Official Implementation (PyTorch) of "DeepVideo-R1"☆31Feb 22, 2026Updated last week
- ☆61Dec 5, 2025Updated 2 months ago
- Official repository for CoMM Dataset☆50Dec 31, 2024Updated last year