zhang9302002 / ThinkingWithVideos
The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"
☆18 Updated this week
Alternatives and similar repositories for ThinkingWithVideos
Users who are interested in ThinkingWithVideos are comparing it to the repositories listed below.
- ☆21 Updated last year
- GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography ☆72 Updated 3 weeks ago
- A list of works on video generation towards world model ☆162 Updated 2 weeks ago
- [ARXIV’25] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control ☆76 Updated last month
- ☆39 Updated this week
- Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation ☆127 Updated last month
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆21 Updated 4 months ago
- open-sourced video dataset with dynamic scenes and camera movements annotation ☆71 Updated 4 months ago
- A comprehensive list of papers investigating physical cognition in video generation, including papers, codes, and related websites. ☆159 Updated last week
- (ECCV 2024) Official implementation of Paper "DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation" ☆39 Updated 10 months ago
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025) ☆39 Updated 3 months ago
- [ICLR 2025] Trajectory Attention For Fine-grained Video Motion Control ☆92 Updated 3 months ago
- [ICLR 2025] Layout-Your-3D: Controllable and Precise 3D Generation with 2D Blueprint ☆14 Updated last month
- A survey for visual generation alignment ☆51 Updated 2 weeks ago
- A curated list of awesome autoregressive papers in Generative AI ☆100 Updated last week
- Self-reimplemented version of 4D-LRM. ☆50 Updated 2 months ago
- [CVPR 2024] Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training ☆41 Updated last year
- SceneCompleter: Dense 3D Scene Completion for Generative Novel View Synthesis ☆35 Updated 2 months ago
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆113 Updated 3 weeks ago
- Official Code for 'AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction' (ICCV 2025) ☆53 Updated last month
- Accepted by CVPR 2024 ☆37 Updated last year
- [ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness ☆50 Updated last month
- WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes ☆98 Updated 5 months ago
- [NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆45 Updated 3 weeks ago
- The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors' ☆107 Updated 3 weeks ago
- The official implementation of work "REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment". ☆116 Updated 11 months ago
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation ☆42 Updated last month
- Official implementation of "Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals" ☆113 Updated last month
- [CVPR2025] Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation ☆120 Updated last month
- Official Code for 'TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction' (ICCV 2025) ☆73 Updated last month