ykarmesh / stable-control-representations
Code for Stable Control Representations
☆23 Updated 2 months ago
Alternatives and similar repositories for stable-control-representations
Users who are interested in stable-control-representations compare it to the repositories listed below:
- Code for paper "Grounding Video Models to Actions through Goal Conditioned Exploration". ☆42 Updated 2 months ago
- ☆64 Updated 6 months ago
- Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks ☆45 Updated 3 months ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks ☆59 Updated 5 months ago
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction ☆25 Updated 2 months ago
- ☆42 Updated 10 months ago
- Implementation of our ICCV 2023 paper DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation ☆19 Updated last year
- Dreamitate: Real-World Visuomotor Policy Learning via Video Generation (CoRL 2024) ☆43 Updated 8 months ago
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆82 Updated last year
- A paper list of world model ☆25 Updated 10 months ago
- List of papers on video-centric robot learning ☆14 Updated 3 months ago
- Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://ar… ☆58 Updated 5 months ago
- ☆65 Updated last month
- ☆15 Updated 4 months ago
- Visual Representation Learning with Stochastic Frame Prediction (ICML 2024) ☆18 Updated 3 months ago
- ☆17 Updated 8 months ago
- ☆73 Updated 6 months ago
- ☆44 Updated 3 months ago
- Repo for Bring Your Own Vision-Language-Action (VLA) model, arxiv 2024 ☆27 Updated last month
- ☆43 Updated last year
- ☆13 Updated 9 months ago
- [ICRA2023] Grounding Language with Visual Affordances over Unstructured Data ☆41 Updated last year
- LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding (CVPR 2023) ☆35 Updated last year
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆93 Updated 4 months ago
- ☆14 Updated 3 weeks ago
- Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning ☆48 Updated last month
- Code for the RSS 2023 paper "Energy-based Models are Zero-Shot Planners for Compositional Scene Rearrangement" ☆19 Updated last year