ZhangXJ199/TinyLLaVA-Video-R1

TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning

☆116

Alternatives and similar repositories for TinyLLaVA-Video-R1

Users that are interested in TinyLLaVA-Video-R1 are comparing it to the libraries listed below.

ZhangXJ199 / TinyLLaVA-Video
A Simple Framework of Small-scale LMMs for Video Understanding
☆114 · Jun 11, 2025 · Updated last year
ZhangXJ199 / Bench-CoE
A Framework for Collaboration of Experts from Benchmark
☆13 · Apr 27, 2025 · Updated last year
ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 10 months ago
OpenGVLab / VideoChat-R1
[NIPS2025] VideoChat-R1 & R1.5: Enhancing Spatio-Temporal Perception and Reasoning via Reinforcement Fine-Tuning
☆268 · Oct 18, 2025 · Updated 9 months ago
tulerfeng / Video-R1
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
☆882 · Dec 14, 2025 · Updated 7 months ago
Dinghow / UIM
The official PyTorch implementation of Exploring the Interactive Guidance for Unified and Effective Image Matting [TOMM 2025]
☆25 · Nov 24, 2025 · Updated 8 months ago
zyang-ur / idea2img
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation, ECCV 2024
☆22 · Feb 15, 2024 · Updated 2 years ago
TencentARC / SEED-Bench-R1
☆100 · Jun 23, 2025 · Updated last year
opendatalab / LEGION
[ICCV25 Highlight] The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"
☆82 · Oct 22, 2025 · Updated 9 months ago
TinyLLaVA / TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
☆995 · Updated this week
Wang-Xiaodong1899 / Open-R1-Video
✨First Open-Source R1-like Video-LLM [2025/02/18]
☆382 · Jul 1, 2026 · Updated 3 weeks ago
CeeZh / SILVR
Official Implementation for "SiLVR: A Simple Language-based Video Reasoning Framework"
☆19 · Jan 18, 2026 · Updated 6 months ago
Ziyang412 / Video-RTS
Code for EMNLP25 paper "Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning"
☆24 · Feb 18, 2026 · Updated 5 months ago
Hui-design / Open-LLaVA-Video-R1
[LLaVA-Video-R1] ✨First Adaptation of R1 to LLaVA-Video (2025-03-18)
☆68 · May 9, 2025 · Updated last year
SCZwangxiao / video-ReTaKe
Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding
☆40 · Mar 16, 2025 · Updated last year
jylins / hourllava
[NeurIPS 2025 Spotlight] Unleashing Hour-Scale Video Training for Long Video-Language Understanding
☆19 · Jun 24, 2025 · Updated last year
1229095296 / ResRL
This repository includes code for our paper: ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning…
☆15 · May 2, 2026 · Updated 2 months ago
longmalongma / TW-GRPO
The official repository of our paper "Reinforcing Video Reasoning with Focused Thinking"
☆36 · Jun 12, 2025 · Updated last year
opendatalab / LOKI
[ICLR 2025 Spotlight] The official implementation of the paper “LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multi…
☆180 · Feb 7, 2026 · Updated 5 months ago
thunlp / KARL
KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding
☆68 · Apr 5, 2026 · Updated 3 months ago
hshjerry / VideoEspresso
[CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
☆140 · Jul 28, 2025 · Updated 11 months ago
ZichenWen1 / DIJA
(ICLR 2026 🔥) Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs"
☆79 · Feb 9, 2026 · Updated 5 months ago
www-Ye / Time-R1
R1-like Video-LLM for Temporal Grounding
☆138 · Jun 20, 2025 · Updated last year
kyegomez / MC-ViT
Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"
☆27 · Updated this week
daeunni / Video-Skill-CoT
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Findings]"
☆18 · Aug 27, 2025 · Updated 10 months ago
opendatalab / FakeVLM
[NeurIPS 2025 🔥] FakeVLM: Advancing Synthetic Image Detection through Explainable Multimodal Models and Fine-Grained Artifact Analysis
☆157 · Sep 24, 2025 · Updated 10 months ago
appletea233 / Temporal-R1
Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency
☆62 · Jun 6, 2025 · Updated last year
Yui010206 / CREMA
[ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
☆56 · Jul 1, 2025 · Updated last year
showlab / MovieSeq
[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences
☆46 · Mar 11, 2025 · Updated last year
opendatalab / UrBench
[AAAI 2025] This repo contains evaluation code for the paper “UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in…
☆37 · Apr 10, 2025 · Updated last year
VectorSpaceLab / Video-XL
🔥🔥First-ever hour-scale video understanding models
☆626 · Jul 14, 2025 · Updated last year
SJTU-DENG-Lab / R1-Zero-VSI
☆42 · Jun 9, 2025 · Updated last year
xuyang-liu16 / GlobalCom2
[AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models
☆42 · Jan 27, 2026 · Updated 5 months ago
UCSC-VLAA / VLAA-Thinking
[TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
☆148 · Oct 10, 2025 · Updated 9 months ago
Hui-design / R1-Video-fixbug
[Blog 1] Recording a bug of grpo_trainer in some R1 projects
☆23 · Feb 23, 2025 · Updated last year
josephzpng / DisTime
DisTime: Distribution-based Time Representation for Video Large Language Models.
☆21 · Jul 10, 2025 · Updated last year
ModalMinds / MM-PRM
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision
☆30 · May 26, 2025 · Updated last year
haoningwu3639 / SpatialScore
[CVPR 2026 Highlight] SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
☆84 · May 28, 2026 · Updated last month
UniX-AI-Lab / WorldReasonBench
WorldReasonBench: Human-Aligned Stress Testing of Video Generators as Future World-State Predictors
☆22 · May 19, 2026 · Updated 2 months ago