THUDM / MotionBench

Official code for MotionBench (CVPR 2025)

☆49

Alternatives and similar repositories for MotionBench

Users who are interested in MotionBench are comparing it to the repositories listed below.

hmxiong / StreamChat
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆58 Updated 4 months ago
JoeLeelyf / OVO-Bench
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆74 Updated 3 months ago
shiyi-zh0408 / NAE_CVPR2024
Accepted by CVPR 2024
☆35 Updated last year
TencentARC / TokLIP
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
☆101 Updated last month
rese1f / aurora
[ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
☆113 Updated last month
egolife-ai / Ego-R1
Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning
☆79 Updated 2 weeks ago
seervideodiffusion / SeerVideoLDM
[ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models
☆34 Updated last year
NVlabs / QLIP
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
☆76 Updated 4 months ago
AndyTang15 / FLAG3Dv2
☆21 Updated last year
TencentARC / Video-Holmes
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
☆60 Updated this week
MCG-NJU / VideoChat-Online
[CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online
☆46 Updated last week
ziqihuangg / Awesome-From-Video-Generation-to-World-Model
A list of works on video generation towards world model
☆157 Updated last week
BolinLai / LEGO
[ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation…
☆37 Updated 4 months ago
showlab / FQGAN
FQGAN: Factorized Visual Tokenization and Generation
☆50 Updated 3 months ago
OuyangKun10 / SpaceR
SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning
☆67 Updated last week
aHapBean / VideoREPA
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models
☆52 Updated last month
GLUS-video / GLUS
[CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…
☆45 Updated 3 weeks ago
SilentView / LVD-2M
[NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions"
☆62 Updated 9 months ago
z-x-yang / DoraemonGPT
Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models
☆85 Updated 10 months ago
hshjerry / VideoEspresso
[CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
☆93 Updated last month
showlab / VideoLISA
[NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
☆123 Updated 6 months ago
mll-lab-nu / TStar
☆56 Updated 3 months ago
jialuli-luka / Video-MSG
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
☆21 Updated 3 months ago
showlab / Impossible-Videos
ICML 2025 - Impossible Videos
☆68 Updated last month
Tencent / HaploVLM
ICML2025
☆49 Updated last month
showlab / Exo2Ego-V
☆49 Updated 2 months ago
jh-yi / Video-Panda
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models [CVPR 2025]
☆71 Updated 3 weeks ago
pittisl / PhyT2V
Official code repo for the CVPR 2025 paper PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
☆38 Updated 3 months ago
hu-zijing / B2-DiffuRL
[CVPR 25] A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning.
☆32 Updated 3 months ago
TencentARC / Divot
Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)
☆72 Updated 4 months ago