video-language-planning / vlp_code
☆75 · Updated 7 months ago
Alternatives and similar repositories for vlp_code:
Users interested in vlp_code are comparing it to the libraries listed below.
- ☆67 · Updated 6 months ago
- Code for the paper "Grounding Video Models to Actions through Goal Conditioned Exploration" · ☆44 · Updated 3 months ago
- ☆63 · Updated 5 months ago
- Codebase for HiP · ☆89 · Updated last year
- ☆43 · Updated last year
- ☆94 · Updated 7 months ago
- ☆46 · Updated 3 months ago
- Code for subgoal synthesis via image editing · ☆130 · Updated last year
- [ECCV 2024] 💐 Official implementation of the paper "Diffusion Reward: Learning Rewards via Conditional Video Diffusion" · ☆96 · Updated 9 months ago
- Repo for Bring Your Own Vision-Language-Action (VLA) model, arXiv 2024 · ☆27 · Updated 2 months ago
- ☆59 · Updated 2 weeks ago
- Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://ar… · ☆59 · Updated 6 months ago
- ☆66 · Updated 5 months ago
- Code for "FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks" · ☆52 · Updated 3 months ago
- Streaming Diffusion Policy: Fast Policy Synthesis with Variable Noise Diffusion Models · ☆52 · Updated 6 months ago
- ☆28 · Updated 3 weeks ago
- Dreamitate: Real-World Visuomotor Policy Learning via Video Generation (CoRL 2024) · ☆44 · Updated 9 months ago
- Official repository of "Learning to Act from Actionless Videos through Dense Correspondences" · ☆205 · Updated 11 months ago
- Code release for the paper "Autonomous Improvement of Instruction Following Skills via Foundation Models" (CoRL 2024) · ☆70 · Updated 2 months ago
- Code repository for "Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations" · ☆49 · Updated 3 months ago
- NeurIPS 2022 paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation" · ☆90 · Updated 2 years ago
- Official implementation of the paper "Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance" (CoRL 2024) · ☆20 · Updated last month
- [NeurIPS 2024] GenRL: Multimodal-foundation world models enable grounding language and video prompts into embodied domains, by turning th… · ☆73 · Updated 2 months ago
- Evaluate Multimodal LLMs as Embodied Agents · ☆39 · Updated last month
- Official code for "QueST: Self-Supervised Skill Abstractions for Continuous Control" (NeurIPS 2024) · ☆74 · Updated 4 months ago
- Unified Video Action Model · ☆128 · Updated 2 weeks ago
- VP2 benchmark ("A Control-Centric Benchmark for Video Prediction", ICLR 2023) · ☆27 · Updated last month
- Official implementation of "Self-Improving Video Generation" · ☆62 · Updated last month
- Main augmentation script for a real-world robot dataset · ☆35 · Updated last year
- Code for the paper "Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation" · ☆81 · Updated 8 months ago