64327069/LVAgent

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/64327069/LVAgent)

64327069 / LVAgent

Code of LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents

☆39

Alternatives and similar repositories for LVAgent

Users that are interested in LVAgent are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Haiyang0226 / Symphony
View on GitHub
code of cvpr26 paper Symphony
☆17Apr 7, 2026Updated 3 months ago
MILVLG / videoarm
View on GitHub
☆26Apr 9, 2026Updated 3 months ago
SalesforceAIResearch / ActiveVideoPerception
View on GitHub
Official Code for paper "Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding""
☆18Jun 2, 2026Updated last month
Jialuo-Li / DIG
View on GitHub
[CVPR 2026] Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding
☆21Feb 21, 2026Updated 4 months ago
minghangz / OnVTG
View on GitHub
Online video temporal grounding
☆16Oct 20, 2025Updated 8 months ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
deep-spin / Infinite-Video
View on GitHub
\infty-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation
☆21Feb 14, 2025Updated last year
aiming-lab / ReAgent-V
View on GitHub
[NeurIPS'25] ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding
☆51Sep 21, 2025Updated 9 months ago
OpenGVLab / VRBench
View on GitHub
[ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos
☆28Jun 4, 2026Updated last month
lcqysl / FrameThinker
View on GitHub
[ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"
☆50Oct 9, 2025Updated 9 months ago
EliSpectre / MM-Mem
View on GitHub
[ACL-26 (main)] From Verbatim to Gist Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video A…
☆39Apr 19, 2026Updated 3 months ago
minjoong507 / Consistency-of-Video-LLM
View on GitHub
[CVPR 2025] Official Repository of the paper "On the Consistency of Video Large Language Models in Temporal Comprehension"
☆16Oct 13, 2025Updated 9 months ago
zsgvivo / VideoZoomer
View on GitHub
☆34Feb 12, 2026Updated 5 months ago
mlvlab / BLiM
View on GitHub
Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight)
☆26Aug 1, 2025Updated 11 months ago
wgcyeo / WorldMM
View on GitHub
[CVPR 2026 Highlight] WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning
☆96Jun 18, 2026Updated last month
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
fansunqi / VideoTool
View on GitHub
Official Repository for NeurIPS'25 Paper "Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task"
☆23May 18, 2026Updated 2 months ago
NUS-HPC-AI-Lab / FOCUS
View on GitHub
[ICLR 2026] FOCUS: Efficient Keyframe Selection for Long Video Understanding
☆73Apr 23, 2026Updated 2 months ago
Ziyang412 / Video-RTS
View on GitHub
Code for EMNLP25 paper "Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning"
☆24Feb 18, 2026Updated 5 months ago
AVoCaDO-Captioner / AVoCaDO
View on GitHub
https://avocado-captioner.github.io/
☆37Oct 16, 2025Updated 9 months ago
wangruohui / EfficientVideoAgent
View on GitHub
EVA: Efficient Reinforcement Learning for End-to-End Video Agent
☆26May 6, 2026Updated 2 months ago
ncTimTang / AKS
View on GitHub
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
☆227Dec 19, 2025Updated 7 months ago
fansunqi / AKeyS
View on GitHub
Agentic Keyframe Search for Video Question Answering
☆18Jun 30, 2026Updated 2 weeks ago
OpenGVLab / PVC
View on GitHub
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆54Jun 12, 2025Updated last year
MCG-NJU / ViT-TAD
View on GitHub
[CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
☆11Jun 11, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
mlvlab / DeepVideoR1
View on GitHub
[NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1"
☆35Feb 22, 2026Updated 4 months ago
LayneH / LEWEL
View on GitHub
[CVPR2022] Official Implementation of the paper 'Learning Where to Learn in Cross-View Self-Supervised Learning'
☆29Oct 12, 2022Updated 3 years ago
Lzq5 / UniTime
View on GitHub
Universal Video Temporal Grounding with Generative Multi-modal Large Language Models
☆56May 20, 2026Updated 2 months ago
gaostar123 / DeViL
View on GitHub
[ACM MM 2026] Detector-Empowered Video Large Language Model for Efficient Spatio-Temporal Grounding
☆18Jul 12, 2026Updated last week
Zzz99999 / Video-Zero
View on GitHub
☆19May 15, 2026Updated 2 months ago
yangruoliu / VideoDetective
View on GitHub
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
☆58May 1, 2026Updated 2 months ago
look4u-ok / video-slicer
View on GitHub
☆18Jun 18, 2024Updated 2 years ago
xiaoqian-shen / Vgent
View on GitHub
[NeurIPS 2025 Spotlight] Official PyTorch implementation of Vgent
☆48Nov 30, 2025Updated 7 months ago
weijianan1 / LogicHOI
View on GitHub
[NeurIPS2023] Neural-Logic Human-Object Interaction Detection
☆14Aug 24, 2024Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
EvolvingLMMs-Lab / ParaVT
View on GitHub
ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
☆54Jun 2, 2026Updated last month
iLearn-Lab / MM23-RTQ
View on GitHub
ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model
☆15Apr 7, 2026Updated 3 months ago
nusnlp / d2vlm
View on GitHub
[ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models
☆24Apr 18, 2026Updated 3 months ago
CYWang735 / AdaTooler-V
View on GitHub
☆70Feb 27, 2026Updated 4 months ago
worldbench / VideoLucy
View on GitHub
[NeurIPS 2025] Deep Memory Backtracking for Long Video Understanding
☆68Feb 10, 2026Updated 5 months ago
Rubics-Xuan / IVG
View on GitHub
This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with H…
☆15May 21, 2024Updated 2 years ago
hyungjin-chung / VPS
View on GitHub
☆16Sep 11, 2025Updated 10 months ago