boheumd / MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

☆336

Alternatives and similar repositories for MA-LMM

Users who are interested in MA-LMM are comparing it to the repositories listed below.


RenShuhuai-Andy / TimeChat
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
☆393 Updated 5 months ago
huangb23 / VTimeLLM
[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".
☆290 Updated last year
mbzuai-oryx / Video-LLaVA
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
☆257 Updated 2 months ago
mbzuai-oryx / VideoGPT-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
☆288 Updated 2 months ago
Ziyang412 / VideoTree
Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
☆142 Updated 4 months ago
OpenGVLab / VideoChat-Flash
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
☆474 Updated 4 months ago
ziplab / LongVLM
☆104 Updated last year
TencentARC / ST-LLM
[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
☆150 Updated last year
WisconsinAIVision / ViP-LLaVA
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
☆331 Updated last year
ttengwang / Awesome_Long_Form_Video_Understanding
Awesome papers & datasets specifically focused on long-term videos.
☆321 Updated 2 weeks ago
imagegridworth / IG-VLM
☆138 Updated last year
YueFan1014 / VideoAgent
This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)
☆264 Updated 10 months ago
OpenGVLab / unmasked_teacher
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
☆336 Updated last year
OpenGVLab / video-mamba-suite
The suite of modeling video with Mamba
☆278 Updated last year
EvolvingLMMs-Lab / LongVA
Long Context Transfer from Language to Vision
☆395 Updated 7 months ago
apple / ml-slowfast-llava
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
☆277 Updated last year
gyxxyg / VTG-LLM
[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
☆113 Updated 10 months ago
llyx97 / TempCompass
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, …
☆124 Updated 6 months ago
wxh1996 / VideoAgent
☆116 Updated 6 months ago
gyxxyg / TRACE
[ICLR 2025] TRACE: Temporal Grounding Video LLM via Causal Event Modeling
☆127 Updated 2 months ago
IVGSZ / Flash-VStream
This is the official implementation of ICCV 2025 "Flash-VStream: Efficient Real-Time Understanding for Long Video Streams"
☆238 Updated last week
sudo-Boris / mr-Blip
Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs"
☆91 Updated 7 months ago
OpenGVLab / VideoChat-R1
[NIPS2025] VideoChat-R1 & R1.5: Enhancing Spatio-Temporal Perception and Reasoning via Reinforcement Fine-Tuning
☆211 Updated 3 weeks ago
NeeluMadan / ViFM_Survey
Foundation Models for Video Understanding: A Survey
☆140 Updated 3 months ago
WHB139426 / Grounded-Video-LLM
[EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
☆130 Updated 2 months ago
TimeMarker-LLM / TimeMarker
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
☆99 Updated 10 months ago
scofield7419 / Video-of-Thought
Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"
☆168 Updated 7 months ago
rese1f / MovieChat
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
☆656 Updated 8 months ago
baaivision / EVE
EVE Series: Encoder-Free Vision-Language Models from BAAI
☆353 Updated 3 months ago
HJYao00 / DenseConnector
【NeurIPS 2024】Dense Connector for MLLMs
☆177 Updated last year