InternRobotics / MesaTask
[NeurIPS 2025 Spotlight] MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning
☆55Updated last month
Alternatives and similar repositories for MesaTask
Users who are interested in MesaTask are comparing it to the repositories listed below.
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts.☆188Updated 3 weeks ago
- CVPR 2025☆36Updated 6 months ago
- ☆39Updated last year
- VITRA: Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos☆70Updated last week
- Code implementation of CVPR 2024 highlight paper "PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI"☆178Updated 5 months ago
- [ARXIV’25] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control☆82Updated 4 months ago
- [NeurIPS 24] The implementation and dataset of LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and…☆56Updated 7 months ago
- Official Repository of "EgoMono4D: Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos"☆38Updated last month
- Official Implementation of paper "Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence"☆135Updated 3 months ago
- Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation☆161Updated 3 months ago
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning☆42Updated 11 months ago
- A comprehensive list of papers investigating physical cognition in video generation, including papers, codes, and related websites.☆198Updated 3 weeks ago
- CVPR2025 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation☆29Updated 2 months ago
- [3DV 2025] Official implementation of the paper "SceneMotifCoder: Example-driven Visual Program Learning for Generating 3D Object Arrange…☆40Updated 3 weeks ago
- [AAAI 2025] DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors☆216Updated last year
- A list of works on video generation towards world model☆172Updated 3 weeks ago
- [ECCV 2024] EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion.☆101Updated last year
- [CVPR-2025] GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding☆28Updated 2 months ago
- Code implementation of the paper 'FIction: 4D Future Interaction Prediction from Video'☆17Updated 7 months ago
- [NeurIPS 2025] EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation☆121Updated 3 months ago
- [ICLR 2025 Spotlight] Grounding Video Models to Actions through Goal Conditioned Exploration☆58Updated 6 months ago
- Self-reimplemented version of 4D-LRM.☆62Updated 5 months ago
- ☆78Updated 6 months ago
- The official implementation of work "REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment".☆119Updated last year
- ☆88Updated 5 months ago
- [NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding☆66Updated last month
- Official implementation of ICCV 2025 paper "EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds".☆36Updated 4 months ago
- Physical laws underpin all existence, and harnessing them for generative modeling opens boundless possibilities for advancing science and…☆232Updated 6 months ago
- Code for "BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation", ICCV 2025.☆87Updated last month
- [ECCV 2024] Official Implementation of DragAPart: Learning a Part-Level Motion Prior for Articulated Objects.☆82Updated last year