qiujihao19/LongVideo-R1

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/qiujihao19/LongVideo-R1)

qiujihao19 / LongVideo-R1

[CVPR 2026] LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding

☆50

Alternatives and similar repositories for LongVideo-R1

Users that are interested in LongVideo-R1 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

qiujihao19 / Artemis
View on GitHub
[NeurIPS 2024] Artemis: Towards Referential Understanding in Complex Videos
☆27Apr 8, 2025Updated last year
mingrui-wu / OSI-Bench
View on GitHub
Official repo of From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs
☆24Jun 23, 2026Updated last month
ncTimTang / AKS
View on GitHub
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
☆228Dec 19, 2025Updated 7 months ago
Time-Search / TimeSearch-R
View on GitHub
[ICLR 2026] Official code for paper: TimeSearch-R: Adaptive Temporal Search for Long-Form Video Understanding via Self-Verification Reinf…
☆27Jan 29, 2026Updated 6 months ago
sunsmarterjie / ChatterBox
View on GitHub
[AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues
☆61May 2, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
jylins / videoseek
View on GitHub
[CVPR 2026] VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking
☆64Mar 23, 2026Updated 4 months ago
MILVLG / videoarm
View on GitHub
☆29Apr 9, 2026Updated 3 months ago
callsys / DynRefer
View on GitHub
[CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution
☆59Mar 4, 2025Updated last year
Jialuo-Li / DIG
View on GitHub
[CVPR 2026] Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding
☆21Feb 21, 2026Updated 5 months ago
YWenxi / think-with-images-through-self-calling
View on GitHub
official repo for `thinking with images through-self-calling`
☆26Dec 28, 2025Updated 7 months ago
sunsmarterjie / DAAS
View on GitHub
'Discretization-Aware Architecture Search' alleviates the discretization gap in one-shot differentiable NAS. DAAS has been accepted by PR…
☆20Jul 30, 2021Updated 4 years ago
MzeroMiko / XDLM
View on GitHub
[ICML 2026 Spotlight] Code for miXed Discrete Diffusion Language Model
☆27Mar 16, 2026Updated 4 months ago
agents-x-project / PyVision-RL
View on GitHub
[ICML 2026] Official implementation of "PyVision-RL: Forging Open Agentic Vision Models via RL."
☆70Feb 25, 2026Updated 5 months ago
zjucsq / PLA
View on GitHub
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12Sep 17, 2023Updated 2 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
Haiyang0226 / Symphony
View on GitHub
code of cvpr26 paper Symphony
☆17Apr 7, 2026Updated 3 months ago
AZZMM / CC-Diff
View on GitHub
Implementation of paper "CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis"
☆28Dec 19, 2025Updated 7 months ago
EvolvingLMMs-Lab / LongVT
View on GitHub
[CVPR 2026] LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling
☆257Jun 24, 2026Updated last month
ailab-kyunghee / CM2_DVC
View on GitHub
[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval
☆66Jun 19, 2024Updated 2 years ago
Ziyang412 / Video-RTS
View on GitHub
Code for EMNLP25 paper "Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning"
☆24Feb 18, 2026Updated 5 months ago
mbzuai-oryx / Video-R2
View on GitHub
Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models
☆19Jan 21, 2026Updated 6 months ago
RUCAIBox / LMM-Searcher
View on GitHub
The official code of "Towards Long-horizon Agentic Multimodal Search"
☆27Apr 17, 2026Updated 3 months ago
wangruohui / EfficientVideoAgent
View on GitHub
EVA: Efficient Reinforcement Learning for End-to-End Video Agent
☆26May 6, 2026Updated 2 months ago
yaolinli / GenS
View on GitHub
[ACL 2025 Findings] GenS: Generative Frame Sampler for Long Video Understanding
☆22Aug 21, 2025Updated 11 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
IVUL-KAUST / VideoAuto-R1
View on GitHub
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88Feb 27, 2026Updated 5 months ago
Haorane / VideoHV-Agent
View on GitHub
Official implementation of "Think, Then Verify: A Hypothesis–Verification Multi-Agent Framework for Long Video Understanding(CVPR'2026)"
☆24Apr 16, 2026Updated 3 months ago
xiaomi-research / time-r1
View on GitHub
[NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
☆95Dec 14, 2025Updated 7 months ago
martian422 / MaskGRPO
View on GitHub
The official implementation of MaskGRPO: Consolidating Reinforcement Learning for Multimodal Discrete Diffusion Models. (ICLR 2026, arxiv…
☆19Jan 27, 2026Updated 6 months ago
hzphzp / WeGen
View on GitHub
☆27Apr 25, 2025Updated last year
ht014 / SG2HOI
View on GitHub
☆12Sep 19, 2021Updated 4 years ago
jiafei1224 / SPACE
View on GitHub
This repository contains the source codes for the paper: "SPACE: A Simulator for Physical Interactions and Causal Learning in 3D Environm…
☆16Oct 11, 2021Updated 4 years ago
wuw2019 / LoTLIP
View on GitHub
[NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
☆49Jan 14, 2025Updated last year
QiWang98 / VideoRFT
View on GitHub
[NeurIPS 2025] VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning
☆64Jan 6, 2026Updated 6 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
sunsmarterjie / beyond_masking
View on GitHub
Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers
☆26Apr 12, 2022Updated 4 years ago
MCG-NJU / StreamForest
View on GitHub
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆133Nov 4, 2025Updated 8 months ago
lbaermann / qaego4d
View on GitHub
Code and Dataset for the CVPRW Paper "Where did I leave my keys? — Episodic-Memory-Based Question Answering on Egocentric Videos"
☆31Aug 28, 2023Updated 2 years ago
TencentARC / DSR_Suite
View on GitHub
☆74Apr 21, 2026Updated 3 months ago
callsys / GMPO
View on GitHub
[ICLR 2026] Geometric-Mean Policy Optimization
☆104Jan 26, 2026Updated 6 months ago
OVAD-Benchmark / ovad-benchmark-code
View on GitHub
OVAD: Open-vocabulary Attribute Detection code
☆30Aug 28, 2023Updated 2 years ago
franciszzj / VLPrompt
View on GitHub
[IJCV 2025] VLPrompt-PSG: Vision-Language Prompting for Panoptic Scene Graph Generation
☆28Sep 24, 2024Updated last year