cfeng16 / this-and-that
☆18Updated 10 months ago
Alternatives and similar repositories for this-and-that:
Users who are interested in this-and-that are comparing it to the repositories listed below
- This is the official implementation of Video Generation part of This&That: Language-Gesture Controlled Video Generation for Robot Plannin…☆39Updated 3 months ago
- [ICLR 2025 Spotlight] Grounding Video Models to Actions through Goal Conditioned Exploration☆48Updated this week
- Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks☆62Updated 4 months ago
- Dreamitate: Real-World Visuomotor Policy Learning via Video Generation (CoRL 2024)☆44Updated 10 months ago
- ☆69Updated 8 months ago
- ☆29Updated 4 months ago
- Repo for Bring Your Own Vision-Language-Action (VLA) model, arxiv 2024☆27Updated 3 months ago
- ☆76Updated 8 months ago
- Main augmentation script for real-world robot datasets.☆35Updated last year
- Official Repository of "EgoMono4D: Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos"☆21Updated last month
- Efficiently apply modification functions to RLDS/TFDS datasets.☆28Updated 11 months ago
- Code for Stable Control Representations☆24Updated last month
- [ECCV'24] Parameterized Quasi-Physical Simulators for Dexterous Manipulations Transfer☆67Updated 9 months ago
- ☆21Updated 6 months ago
- ☆99Updated 8 months ago
- List of papers on video-centric robot learning☆19Updated 5 months ago
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction☆30Updated 4 months ago
- [ICLR 2025🎉] This is the official implementation of paper "Robots Pre-Train Robots: Manipulation-Centric Robotic Representation from Lar…☆73Updated 3 months ago
- Streaming Diffusion Policy: Fast Policy Synthesis with Variable Noise Diffusion Models☆55Updated 7 months ago
- ☆33Updated last month
- Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets☆64Updated 3 weeks ago
- [CVPR 25] G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation☆59Updated last month
- (Incomplete version) This is an implementation of affordancellm.☆11Updated 6 months ago
- ☆13Updated 3 weeks ago
- Code for the RSS 2023 paper "Energy-based Models are Zero-Shot Planners for Compositional Scene Rearrangement"☆19Updated last year
- [ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment☆76Updated 2 months ago
- Agent-to-Sim: Learning Interactive Behavior from Casual Videos.☆43Updated 6 months ago
- ☆70Updated 3 weeks ago
- View-Invariant Policy Learning via Zero-Shot Novel View Synthesis (CoRL 2024)☆20Updated 4 months ago
- NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks☆36Updated this week