EvolvingLMMs-Lab/LongVT

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/EvolvingLMMs-Lab/LongVT)

EvolvingLMMs-Lab / LongVT

[CVPR 2026] LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

☆254

Alternatives and similar repositories for LongVT

Users that are interested in LongVT are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

EvolvingLMMs-Lab / OpenMMReasoner
View on GitHub
[CVPR 2026] OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
☆164Mar 30, 2026Updated 3 months ago
EvolvingLMMs-Lab / ParaVT
View on GitHub
ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
☆54Jun 2, 2026Updated last month
XIAO4579 / PRISM
View on GitHub
Beyond SFT-to-RL: Pre-alignment via Black-BoxOn-Policy Distillation for Multimodal RL
☆96May 6, 2026Updated 2 months ago
zhang9302002 / ThinkingWithVideos
View on GitHub
The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"
☆101Oct 15, 2025Updated 9 months ago
zjuruizhechen / Awesome-Video-Agent
View on GitHub
A collection of awesome think with videos papers.
☆100Dec 1, 2025Updated 7 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
UniX-AI-Lab / WorldReasonBench
View on GitHub
WorldReasonBench: Human-Aligned Stress Testing of Video Generators as Future World-State Predictors
☆22May 19, 2026Updated 2 months ago
marinero4972 / Open-o3-Video
View on GitHub
[ICML 2026] Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence"
☆157May 1, 2026Updated 2 months ago
wangruohui / EfficientVideoAgent
View on GitHub
EVA: Efficient Reinforcement Learning for End-to-End Video Agent
☆26May 6, 2026Updated 2 months ago
lcqysl / FrameThinker
View on GitHub
[ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"
☆50Oct 9, 2025Updated 9 months ago
EvolvingLMMs-Lab / SimpleStream
View on GitHub
A simple video streaming baseline that outperforms SOTAs.
☆148May 1, 2026Updated 2 months ago
RUC-NLPIR / VideoDeepResearch
View on GitHub
☆155Nov 17, 2025Updated 8 months ago
EvolvingLMMs-Lab / OneVision-Encoder
View on GitHub
Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence
☆385Jun 20, 2026Updated last month
NVlabs / Long-RL
View on GitHub
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
☆726Sep 24, 2025Updated 9 months ago
shijian2001 / Video-Thinker
View on GitHub
Sparking "Thinking with Videos" via Reinforcement Learning
☆161Oct 30, 2025Updated 8 months ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
qiujihao19 / LongVideo-R1
View on GitHub
[CVPR 2026] LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding
☆50Jul 7, 2026Updated last week
IVUL-KAUST / VideoAuto-R1
View on GitHub
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88Feb 27, 2026Updated 4 months ago
EvolvingLMMs-Lab / Evolving-Visual-Generation
View on GitHub
[Roadmap] Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
☆124Jun 9, 2026Updated last month
EvolvingLMMs-Lab / LLaVA-OneVision-2
View on GitHub
Fully Open Framework for Democratized Multimodal Training
☆1,141Updated this week
w-yibo / VTC-R1
View on GitHub
VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning.
☆26Feb 20, 2026Updated 5 months ago
EvolvingLMMs-Lab / lmms-engine
View on GitHub
A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale.
☆805Updated this week
zhaochen0110 / Awesome_Think_With_Images
View on GitHub
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,491Mar 9, 2026Updated 4 months ago
Koreyoshi01 / VISD
View on GitHub
This repository is the official implementation for VISD.
☆21May 17, 2026Updated 2 months ago
Luodian / nano-hevc
View on GitHub
A minimal, educational HEVC (H.265) encoder written in Python.
☆53Feb 23, 2026Updated 4 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
MCG-NJU / Video-o3
View on GitHub
[ICML 2026] Video-o3: Native Interleaved Clue Seeking for Long Video Multi-Hop Reasoning
☆130Jul 2, 2026Updated 2 weeks ago
fansunqi / VideoTool
View on GitHub
Official Repository for NeurIPS'25 Paper "Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task"
☆23May 18, 2026Updated 2 months ago
egolife-ai / Ego-R1
View on GitHub
[TPAMI 2026] Ego-R1: Agentic Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning
☆165Jun 10, 2026Updated last month
Visual-Agent / DeepEyes
View on GitHub
☆1,247Nov 20, 2025Updated 8 months ago
LJungang / Awesome-Video-Reasoning-Landscape
View on GitHub
🔥An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks.
☆188Jun 14, 2026Updated last month
hshjerry / VideoEspresso
View on GitHub
[CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
☆140Jul 28, 2025Updated 11 months ago
EvolvingLMMs-Lab / EASI
View on GitHub
Holistic Evaluation of Multimodal LLMs on Spatial Intelligence
☆117Jul 1, 2026Updated 2 weeks ago
Infinity-AILab / DeepResearchEval
View on GitHub
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation.
☆141Feb 10, 2026Updated 5 months ago
EvolvingLMMs-Lab / multimodal-search-r1
View on GitHub
[ACL-2026] MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal…
☆469Apr 7, 2026Updated 3 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
TencentARC / TimeLens
View on GitHub
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
☆158Apr 27, 2026Updated 2 months ago
zsgvivo / VideoZoomer
View on GitHub
☆34Feb 12, 2026Updated 5 months ago
NJU-LINK / OmniVideoBench
View on GitHub
The Source Code for OmniVideoBench @ICLR 2026
☆76Feb 12, 2026Updated 5 months ago
EvolvingLMMs-Lab / multimodal-sae
View on GitHub
[ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.
☆199Sep 26, 2025Updated 9 months ago
microsoft / DeepVideoDiscovery
View on GitHub
**Deep Video Discovery (DVD)** is a deep-research style question answering agent designed for understanding extra-long videos.
☆403Nov 3, 2025Updated 8 months ago
cyuQ1n / EasyVideoR1
View on GitHub
☆155Apr 27, 2026Updated 2 months ago
wgcyeo / WorldMM
View on GitHub
[CVPR 2026 Highlight] WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning
☆96Jun 18, 2026Updated last month