LatentActionPretraining / LAPA
[ICLR 2025] LAPA: Latent Action Pretraining from Videos
☆154 · Updated 3 weeks ago
Alternatives and similar repositories for LAPA:
Users interested in LAPA are comparing it to the repositories listed below.
- ☆61 · Updated 5 months ago
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization ☆76 · Updated 2 weeks ago
- ☆43 · Updated 2 months ago
- Official repository of "Learning to Act from Actionless Videos through Dense Correspondences" ☆199 · Updated 9 months ago
- ☆73 · Updated 5 months ago
- ☆91 · Updated 6 months ago
- Latent Motion Token as the Bridging Language for Robot Manipulation ☆72 · Updated last week
- [ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation ☆81 · Updated this week
- Official repo of VLABench, a large-scale benchmark designed for fairly evaluating VLAs, embodied agents, and VLMs ☆120 · Updated last week
- OpenVLA: An open-source vision-language-action model for robotic manipulation ☆108 · Updated 2 weeks ago
- Embodied Chain of Thought: A robotic policy that reasons to solve the task ☆143 · Updated 5 months ago
- 🔥[ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy ☆187 · Updated 2 weeks ago
- Code for subgoal synthesis via image editing ☆123 · Updated last year
- Official implementation of GR-MG ☆70 · Updated last month
- [ICML 2024] The official implementation of "DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning" ☆76 · Updated 4 months ago
- ☆65 · Updated 4 months ago
- NeurIPS 2022 paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation" ☆86 · Updated last year
- [RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre… ☆99 · Updated 4 months ago
- ☆34 · Updated 9 months ago
- Code for Reinforcement Learning from Vision Language Foundation Model Feedback ☆78 · Updated 8 months ago
- A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks ☆83 · Updated 2 weeks ago
- [ICCV 2023] Official code repository for the ARNOLD benchmark ☆152 · Updated 10 months ago
- A Vision-Language Model for Spatial Affordance Prediction in Robotics ☆99 · Updated last week
- Official implementation of ReALFRED (ECCV'24) ☆35 · Updated 4 months ago
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223 ☆114 · Updated last month
- Code for the paper "Predicting Point Tracks from Internet Videos Enables Diverse Zero-Shot Manipulation" ☆77 · Updated 6 months ago
- Code for BAKU: An Efficient Transformer for Multi-Task Policy Learning ☆84 · Updated 7 months ago
- MOKA: Open-World Robotic Manipulation through Mark-based Visual Prompting (RSS 2024) ☆70 · Updated 7 months ago