[NeurIPS 2025] VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning
☆66Jan 6, 2026Updated 5 months ago
Alternatives and similar repositories for VideoRFT
Users that are interested in VideoRFT are comparing it to the libraries listed below.
- [ECCV2024] Nonverbal Interaction Detection☆31Oct 30, 2024Updated last year
- [NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1"☆35Feb 22, 2026Updated 3 months ago
- The official repository of our paper "Reinforcing Video Reasoning with Focused Thinking"☆36Jun 12, 2025Updated last year
- [ACL 2023] Transforming Visual Scene Graphs to Image Captions☆10Dec 13, 2023Updated 2 years ago
- the official code for Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning☆46Nov 26, 2025Updated 6 months ago
- [NeurIPS 2025] Reasoning MLLM, Share-GRPO, advantage vanishing, sparse reward☆37Sep 19, 2025Updated 8 months ago
- ☆13Jul 10, 2024Updated last year
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation☆25Jul 30, 2025Updated 10 months ago
- FFNet: MetaMixer-based Efficient Convolutional Mixer Design☆34Mar 11, 2025Updated last year
- [ICLR 2025] Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception☆15Jul 4, 2025Updated 11 months ago
- 🔥🔥🔥 Latest Papers, Codes and Datasets on Video-LMM Post-Training☆290Mar 3, 2026Updated 3 months ago
- ☆15Mar 30, 2025Updated last year
- [CVPR 25] A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning.☆53Mar 31, 2025Updated last year
- LLaVA-Next for STVG☆21Dec 5, 2025Updated 6 months ago
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability☆17May 8, 2025Updated last year
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?☆94Jul 13, 2025Updated 11 months ago
- Semantic-Guided Diffusion Model for Single-Step Image Super-Resolution☆25Jun 10, 2025Updated last year
- Official code of ACM MM2024 paper- Unseen No More: Unlocking the Potential of CLIP for Generative Zero-shot HOI Detection☆24Aug 15, 2024Updated last year
- Luogu API documentation☆13Nov 15, 2025Updated 7 months ago
- [ICLR 2025, AAAI 2026] official implementation of "Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generati…☆39Jan 26, 2026Updated 4 months ago
- [ECCV 2024] OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models☆50Jan 8, 2025Updated last year
- EgoToM is an egocentric theory-of-mind benchmark built on Ego4D videos, containing multi-choice questions that evaluate multimodal large …☆16Apr 1, 2025Updated last year
- Implementation of AdaCQR(COLING 2025)☆15Dec 30, 2024Updated last year
- [ICML 2025 Oral] An official implementation of VideoRoPE & VideoRoPE++☆221Apr 15, 2026Updated 2 months ago
- [CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning"☆35Feb 2, 2024Updated 2 years ago
- Official implementation for the paper "Self-Play Reinforcement Learning for Fast Image Retargeting"☆10Oct 5, 2020Updated 5 years ago
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning☆37May 9, 2026Updated last month
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment.☆86May 4, 2025Updated last year
- Official Implementation of wd1☆30Sep 25, 2025Updated 8 months ago
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation☆36Feb 28, 2026Updated 3 months ago
- Official implementation of "UniMedVL: Unifying Medical Multimodal Understanding and Generation through Observation-Knowledge-Analysis" - …☆94Jun 5, 2026Updated last week
- PyTorch implementation of "HERO: Human Reaction Generation from Videos (ICCV 2025)"☆34Mar 27, 2026Updated 2 months ago
- ☆26Nov 20, 2025Updated 6 months ago
- Official implementation of HCoG: Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation [CVPR 2025]☆58Jul 28, 2025Updated 10 months ago
- Sparse Autoencoders (SAE) vs CLIP fine-tuning fun.☆18Dec 19, 2024Updated last year
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".☆95Jul 10, 2025Updated 11 months ago
- ☆28Mar 3, 2025Updated last year
- Repository of GUI Action Narrator☆13Apr 8, 2025Updated last year
- Implementation of "Spectral Feature Transformation for Person Re-identification"☆31Sep 7, 2019Updated 6 years ago