jmwang0117 / Video4Robot
A list of papers on video-centric robot learning (☆12, updated 2 months ago).

Alternatives and similar repositories for Video4Robot:
Repositories that users interested in Video4Robot often compare it to are listed below.
- Code for the paper "Grounding Video Models to Actions through Goal Conditioned Exploration" (☆37, updated 3 weeks ago)
- OVExp: Open Vocabulary Exploration for Object-Oriented Navigation (☆33, updated 6 months ago)
- Latent Motion Token as the Bridging Language for Robot Manipulation (☆65, updated last month)
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction (☆22, updated 3 weeks ago)
- Dreamitate: Real-World Visuomotor Policy Learning via Video Generation (CoRL 2024) (☆42, updated 6 months ago)
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization (☆64, updated last month)
- Repository for "General Flow as Foundation Affordance for Scalable Robot Learning" (☆45, updated 3 weeks ago)
- [RSS 2024] Learning Manipulation by Predicting Interaction (☆96, updated 5 months ago)
- G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation (☆28, updated last month)
- Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks (☆36, updated last month)
- ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation (☆89, updated 6 months ago)
- A comprehensive list of papers on the definition of World Models and on using World Models for General Video Generation, Embodied AI, and A… (☆55, updated this week)
- Official code repository for the paper "Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation" (☆57, updated 2 weeks ago)
- [NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation (☆88, updated last month)
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223 (☆96, updated last week)
- AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation (☆58, updated 2 weeks ago)
- LAPA: Latent Action Pretraining from Videos (☆136, updated 3 weeks ago)
- Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty (☆19, updated last year)
- Official implementation of the video-generation part of This&That: Language-Gesture Controlled Video Generation for Robot Plannin… (☆22, updated 3 weeks ago)
- Code for Stable Control Representations (☆23, updated 2 weeks ago)
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding (☆80, updated last month)
- [NeurIPS 2024 D&B] Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning (☆55, updated 3 months ago)
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation (☆90, updated 2 months ago)
- Mirage: a zero-shot cross-embodiment policy transfer method, with benchmarking code for cross-embodiment policy transfer (☆17, updated 8 months ago)
- [CoRL 2024] Official repo of "A3VLM: Actionable Articulation-Aware Vision Language Model" (☆102, updated 3 months ago)