This is a collection of recent papers on reasoning in video generation models.
☆136 · Feb 28, 2026 · Updated last week
Alternatives and similar repositories for Awesome-Video-Reasoning
Users who are interested in Awesome-Video-Reasoning are comparing it to the repositories listed below.
- This is a framework for evaluating reasoning in foundational video models. ☆74 · Feb 24, 2026 · Updated last week
- [AAAI 2026] Official implementation of the paper “SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D F… ☆35 · Jan 8, 2026 · Updated 2 months ago
- DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data ☆39 · Dec 12, 2025 · Updated 2 months ago
- A collection of awesome thinking-with-videos papers. ☆91 · Dec 1, 2025 · Updated 3 months ago
- ASID-Caption: Attribute-Structured and Quality-Verified Audiovisual Instruction Dataset and Training Pipeline for Fine-Grained Video Unde… ☆35 · Updated this week
- [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models ☆20 · Jul 17, 2024 · Updated last year
- Scripts for installing ROS Noetic on Ubuntu 22.04 ☆22 · Mar 17, 2023 · Updated 2 years ago
- Official Implementations for Paper - MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues ☆128 · Dec 3, 2025 · Updated 3 months ago
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning ☆35 · Jul 15, 2025 · Updated 7 months ago
- ☆54 · Sep 21, 2025 · Updated 5 months ago
- 🔥 An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks. ☆153 · Jan 16, 2026 · Updated last month
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? ☆42 · Nov 1, 2024 · Updated last year
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning" ☆134 · Nov 4, 2025 · Updated 4 months ago
- Beyond Accuracy: What Matters in Designing Well-Behaved Models? ☆18 · Updated this week
- [ICCV 2025 Workshop Outstanding Paper Award] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation ☆115 · Oct 7, 2025 · Updated 5 months ago
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training. ☆84 · Dec 24, 2025 · Updated 2 months ago
- (ICLR 2026 🔥) Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs" ☆74 · Feb 9, 2026 · Updated 3 weeks ago
- Benchmark dataset and code of MSRVTT-Personalization ☆52 · Nov 10, 2025 · Updated 3 months ago
- VideoNSA: Native Sparse Attention Scales Video Understanding ☆81 · Nov 16, 2025 · Updated 3 months ago
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents ☆21 · Jan 6, 2026 · Updated 2 months ago
- Are Video Models Ready as Zero-shot Reasoners? ☆84 · Nov 24, 2025 · Updated 3 months ago
- The official GitHub repo for the open online courses: "Dive into LLMs". ☆10 · Mar 15, 2024 · Updated last year
- Official Implementation of MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models ☆12 · Nov 1, 2025 · Updated 4 months ago
- PyTorch Implementation for the paper "Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation" accepted to RA-L'24. ☆12 · Nov 27, 2024 · Updated last year
- Official implementation of SPGrasp: A framework for dynamic grasp synthesis from sparse spatiotemporal prompts. ☆19 · Jan 6, 2026 · Updated 2 months ago
- A benchmark of Python Library Migration ☆14 · Apr 5, 2025 · Updated 11 months ago
- ☆10 · Jul 13, 2024 · Updated last year
- Our repo contains an efficient RGB-D feature extractor for category-level and instance-level 6D pose estimation. ☆14 · Oct 29, 2025 · Updated 4 months ago
- [ICLR 2026] Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding ☆30 · Jan 27, 2026 · Updated last month
- This is the official codebase for paper: Scaling Verification Can Be More Effective than Scaling Policy Learning for Vision-Language-Acti… ☆39 · Feb 24, 2026 · Updated last week
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation ☆57 · Sep 12, 2025 · Updated 5 months ago
- [ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" and "Sp… ☆243 · Dec 22, 2025 · Updated 2 months ago
- The public reproducible analysis code used for the gaze project ☆11 · Feb 21, 2026 · Updated 2 weeks ago
- NeurIPS 2024: Bidirectional Recurrence for Cardiac Motion Tracking with Gaussian Process Latent Coding ☆16 · Jun 20, 2025 · Updated 8 months ago
- ☆13 · Jun 4, 2025 · Updated 9 months ago
- [ACL 2025 Main] (🏆 Outstanding Paper Award) Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Proba… ☆16 · Aug 15, 2025 · Updated 6 months ago
- [NeurIPS 2023] Official PyTorch implementation of "Domain Re-Modulation for Few-Shot Generative Domain Adaption" ☆13 · Aug 2, 2024 · Updated last year
- ☆12 · Nov 10, 2020 · Updated 5 years ago
- [RAL 2025] MTIL: Encoding Full History with Mamba for Temporal Imitation Learning ☆27 · Nov 17, 2025 · Updated 3 months ago