ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
☆91 Sep 12, 2025 Updated 5 months ago
Alternatives and similar repositories for ShotBench
Users who are interested in ShotBench are comparing it to the repositories listed below
- Consistent Human Image and Video Generation with Spatially Conditioned Diffusion ☆15 Sep 1, 2025 Updated 6 months ago
- [ICCV 2025] Official Implementation of "Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation". Junyu Xie, Tengda H… ☆21 Jul 26, 2025 Updated 7 months ago
- Learning to cut end-to-end pretrained modules ☆35 Apr 17, 2025 Updated 10 months ago
- RealisMotion: Decomposed Human Motion Control and Video Generation in the World Space ☆39 Oct 16, 2025 Updated 4 months ago
- ☆22 Dec 11, 2025 Updated 2 months ago
- ☆13 May 17, 2025 Updated 9 months ago
- VideoAuteur: Towards Long Narrative Video Generation ☆42 Oct 22, 2025 Updated 4 months ago
- The official code implementation of "LaCon: Late-Constraint Diffusion for Steerable Guided Image Synthesis". ☆37 Dec 11, 2025 Updated 2 months ago
- ☆13 Jul 10, 2024 Updated last year
- Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025) ☆21 Jul 16, 2025 Updated 7 months ago
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? ☆88 Jul 13, 2025 Updated 7 months ago
- ☆32 May 3, 2024 Updated last year
- DreamCinema: Cinematic Transfer with Free Camera and 3D Character ☆95 Jun 13, 2025 Updated 8 months ago
- ☆16 Jun 14, 2024 Updated last year
- [CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection ☆137 Jul 28, 2025 Updated 7 months ago
- ☆58 Oct 19, 2025 Updated 4 months ago
- The official implementation of 'GRID: Visual Layout Generation.' ☆21 Dec 28, 2024 Updated last year
- Official Implementation of "Video Camera Trajectory Editing with Generative Rendering from Estimated Geometry" ☆31 Nov 10, 2025 Updated 3 months ago
- DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging ☆47 Apr 27, 2025 Updated 10 months ago
- [NeurIPS 2025 Spotlight] Towards Understanding Camera Motions in Any Video ☆270 Nov 24, 2025 Updated 3 months ago
- [AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P… ☆64 Jan 27, 2026 Updated last month
- (NeurIPS 2024) BiDM: Pushing the Limit of Quantization for Diffusion Models ☆22 Nov 20, 2024 Updated last year
- A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team. ☆74 Oct 14, 2024 Updated last year
- ☆17 Feb 20, 2025 Updated last year
- Quick Long Video Understanding [TMLR2025] ☆76 Oct 27, 2025 Updated 4 months ago
- ☆27 Jun 3, 2025 Updated 9 months ago
- ☆139 Nov 17, 2025 Updated 3 months ago
- ☆28 Mar 4, 2025 Updated last year
- [NeurIPS 2025] Controllable Human-centric Keyframe Interpolation with Generative Prior ☆30 Dec 31, 2025 Updated 2 months ago
- A light-weight and high-efficient training framework for accelerating diffusion tasks. ☆51 Sep 14, 2024 Updated last year
- ☆24 Nov 1, 2024 Updated last year
- Official code release for ICCV2025 paper (Highlight): MoGA: 3D Generative Avatar Prior for Monocular Gaussian Avatar Reconstruction ☆47 Oct 20, 2025 Updated 4 months ago
- T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation ☆36 Sep 16, 2025 Updated 5 months ago
- [ACM Multimedia 2025 Datasets Track] EditWorld: Simulating World Dynamics for Instruction-Following Image Editing ☆139 Aug 2, 2025 Updated 7 months ago
- ☆48 Mar 12, 2025 Updated 11 months ago
- ☆27 Mar 3, 2025 Updated last year
- ☆31 Jul 16, 2025 Updated 7 months ago
- Chain-of-Frames [CVPR 2026] ☆38 Jul 2, 2025 Updated 8 months ago
- [ICCV 2025] MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance ☆178 Feb 11, 2026 Updated 3 weeks ago