Kiteretsu77 / This_and_That_VDM
This is the official implementation of the Video Generation part of This&That: Language-Gesture Controlled Video Generation for Robot Planning (ICRA 2025)
☆38Updated 2 months ago
Alternatives and similar repositories for This_and_That_VDM:
Users who are interested in This_and_That_VDM are comparing it to the repositories listed below
- Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks☆58Updated 4 months ago
- List of papers on video-centric robot learning☆19Updated 5 months ago
- Code for paper "Grounding Video Models to Actions through Goal Conditioned Exploration".☆44Updated 3 months ago
- ☆99Updated 8 months ago
- Dreamitate: Real-World Visuomotor Policy Learning via Video Generation (CoRL 2024)☆44Updated 9 months ago
- Repository for "General Flow as Foundation Affordance for Scalable Robot Learning"☆51Updated 3 months ago
- [CVPR 25] G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation☆56Updated 2 weeks ago
- ☆18Updated 9 months ago
- View-Invariant Policy Learning via Zero-Shot Novel View Synthesis (CoRL 2024)☆19Updated 3 months ago
- Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets☆54Updated this week
- Single-file implementation to advance vision-language-action (VLA) models with reinforcement learning.☆50Updated this week
- Code for the paper "Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation"☆83Updated 8 months ago
- ☆68Updated 7 months ago
- Main augmentation script for a real-world robot dataset.☆35Updated last year
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction☆28Updated 3 months ago
- Official Repository of "EgoMono4D: Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos"☆19Updated 3 weeks ago
- ☆66Updated last week
- Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression☆36Updated 2 months ago
- [NeurIPS 24] The implementation and dataset of LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and…☆55Updated 2 weeks ago
- [NeurIPS 2024 D&B] Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning☆70Updated 6 months ago
- ☆41Updated last year
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning☆37Updated 4 months ago
- Latent Motion Token as the Bridging Language for Robot Manipulation☆81Updated 3 weeks ago
- ☆59Updated last week
- ☆54Updated 3 weeks ago
- [ICLR 2025🎉] This is the official implementation of paper "Robots Pre-Train Robots: Manipulation-Centric Robotic Representation from Lar…☆67Updated 2 months ago
- ☆46Updated 4 months ago
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025)☆152Updated 3 weeks ago
- Code implementation of CVPR 2024 highlight paper "PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI"☆141Updated 5 months ago
- ☆37Updated 4 months ago