YueFan1014/VideoAgent

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/YueFan1014/VideoAgent)

YueFan1014 / VideoAgent

This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)

☆320

Alternatives and similar repositories for VideoAgent

Users that are interested in VideoAgent are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

wxh1996 / VideoAgent
View on GitHub
☆150Apr 16, 2025Updated last year
Ziyang412 / VideoTree
View on GitHub
Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
☆165Jun 23, 2025Updated last year
hmxiong / StreamChat
View on GitHub
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆111Mar 14, 2025Updated last year
zjucsq / PLA
View on GitHub
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12Sep 17, 2023Updated 2 years ago
fansunqi / AKeyS
View on GitHub
Agentic Keyframe Search for Video Question Answering
☆18Jun 30, 2026Updated 3 weeks ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
wenhaochai / MovieChat
View on GitHub
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
☆704Jan 29, 2025Updated last year
Embodied-VideoAgent / embodied-videoagent
View on GitHub
☆49Aug 26, 2025Updated 10 months ago
kkahatapitiya / LangRepo
View on GitHub
Code for our ACL 2025 paper "Language Repository for Long Video Understanding"
☆36Jun 17, 2024Updated 2 years ago
clova-tool / CLOVA-tool
View on GitHub
☆30Jun 19, 2024Updated 2 years ago
yunlong10 / Awesome-LLMs-for-Video-Understanding
View on GitHub
🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.
☆3,246Jun 13, 2026Updated last month
MCG-NJU / VideoChat-Online
View on GitHub
[CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online
☆97Oct 7, 2025Updated 9 months ago
RUC-NLPIR / VideoDeepResearch
View on GitHub
☆155Nov 17, 2025Updated 8 months ago
boheumd / MA-LMM
View on GitHub
(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
☆350Jul 19, 2024Updated 2 years ago
Becomebright / GroundVQA
View on GitHub
Official PyTorch code of GroundVQA (CVPR'24)
☆63Sep 13, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
gls0425 / LinVT
View on GitHub
LinVT: Empower Your Image-level Large Language Model to Understand Videos
☆84Dec 30, 2024Updated last year
Leon1207 / Video-RAG-master
View on GitHub
✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehensi…
☆446Jun 26, 2026Updated 3 weeks ago
ziqipang / MR-Video
View on GitHub
MR. Video: MapReduce is the Principle for Long Video Understanding
☆31Jun 18, 2026Updated last month
ttengwang / Awesome_Long_Form_Video_Understanding
View on GitHub
Awesome papers & datasets specifically focused on long-term videos.
☆381Oct 9, 2025Updated 9 months ago
OpenGVLab / InternVideo
View on GitHub
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
☆2,339Jul 2, 2026Updated 3 weeks ago
CeeZh / SILVR
View on GitHub
Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework"
☆19Jan 18, 2026Updated 6 months ago
egoschema / EgoSchema
View on GitHub
☆117Dec 30, 2024Updated last year
scofield7419 / Video-of-Thought
View on GitHub
Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"
☆182Feb 25, 2025Updated last year
Becomebright / ReKV
View on GitHub
[ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
☆122Nov 4, 2025Updated 8 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
huangb23 / VTimeLLM
View on GitHub
[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".
☆295Jun 13, 2024Updated 2 years ago
bytedance / tarsier
View on GitHub
Tarsier -- a family of large-scale video-language models, which is designed to generate high-quality video descriptions , together with g…
☆548Aug 14, 2025Updated 11 months ago
EvolvingLMMs-Lab / LongVA
View on GitHub
Long Context Transfer from Language to Vision
☆407Mar 18, 2025Updated last year
JIA-Lab-research / LLaMA-VID
View on GitHub
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
☆861Jul 29, 2024Updated last year
iLearn-Lab / ACL25-AdaReTaKe
View on GitHub
Official implementation of paper AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding
☆91Apr 21, 2026Updated 3 months ago
showlab / videollm-online
View on GitHub
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
☆677Nov 26, 2025Updated 7 months ago
JoeLeelyf / OVO-Bench
View on GitHub
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆154Jul 24, 2025Updated 11 months ago
ziplab / LongVLM
View on GitHub
☆108Jul 30, 2024Updated last year
ncTimTang / AKS
View on GitHub
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
☆228Dec 19, 2025Updated 7 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
ByteDance-Seed / m3-agent
View on GitHub
☆1,423Feb 12, 2026Updated 5 months ago
md-mohaiminul / BIMBA
View on GitHub
☆29Jul 25, 2025Updated 11 months ago
OpenGVLab / Ask-Anything
View on GitHub
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
☆3,344Updated this week
IVGSZ / Flash-VStream
View on GitHub
This is the official implementation of ICCV 2025 "Flash-VStream: Efficient Real-Time Understanding for Long Video Streams"
☆286Oct 15, 2025Updated 9 months ago
mbzuai-oryx / Video-ChatGPT
View on GitHub
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the cap…
☆1,503Aug 5, 2025Updated 11 months ago
MCG-NJU / StreamForest
View on GitHub
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆132Nov 4, 2025Updated 8 months ago
kahnchana / mvu
View on GitHub
🤖 [ICLR'25] Multimodal Video Understanding Framework (MVU)
☆58Jan 31, 2025Updated last year