Liuziyu77 / Visual-RFT
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
☆1,980 Updated last month
Alternatives and similar repositories for Visual-RFT
Users who are interested in Visual-RFT are comparing it to the repositories listed below
- A fork to add multimodal model training to open-r1 ☆1,306 Updated 4 months ago
- This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-sta… ☆607 Updated last week
- ☆1,206 Updated this week
- Solve Visual Understanding with Reinforced VLMs ☆5,159 Updated last month
- MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning ☆654 Updated 3 weeks ago
- This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas… ☆908 Updated this week
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks. ☆770 Updated last month
- Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement" ☆418 Updated last week
- Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey ☆655 Updated last month
- R1-onevision, a visual language model capable of deep CoT reasoning. ☆528 Updated 2 months ago
- EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL ☆2,691 Updated this week
- Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video] ☆569 Updated 3 weeks ago
- An open-source implementation for fine-tuning Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud. ☆826 Updated last week
- VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024) ☆486 Updated last month
- Explore the Multimodal “Aha Moment” on 2B Model ☆592 Updated 3 months ago
- R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization ☆399 Updated this week
- ☆504 Updated this week
- Awesome RL Reasoning Recipes ("Triple R") ☆674 Updated last week
- Align Anything: Training All-modality Model with Feedback ☆3,969 Updated 3 weeks ago
- A Framework of Small-scale Large Multimodal Models ☆836 Updated last month
- Official Repo for Open-Reasoner-Zero ☆1,967 Updated 2 weeks ago
- Next-Token Prediction is All You Need ☆2,149 Updated 3 months ago
- Personal Project: MPP-Qwen14B & MPP-Qwen-Next(Multimodal Pipeline Parallel based on Qwen-LM). Support [video/image/multi-image] {sft/conv… ☆451 Updated 3 months ago
- MM-Eureka V0, also called R1-Multimodal-Journey; the latest version is in MM-Eureka ☆307 Updated last month
- Easy Data Cleaning, Augmentation and Evaluation with latest LLMs based Operators and Pipelines. ☆285 Updated this week
- minimal-cost for training 0.5B R1-Zero ☆742 Updated last month
- ☆363 Updated 4 months ago
- Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities ☆893 Updated 2 months ago
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ☆446 Updated 5 months ago
- Reproduce R1 Zero on Logic Puzzle ☆2,355 Updated 3 months ago