Dantong88 / LLARVA
☆42 · Updated 2 months ago
Alternatives and similar repositories for LLARVA:
Users that are interested in LLARVA are comparing it to the libraries listed below
- Latent Motion Token as the Bridging Language for Robot Manipulation ☆71 · Updated last week
- Official implementation of GR-MG ☆68 · Updated last month
- [ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation ☆72 · Updated this week
- [RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre… ☆93 · Updated 4 months ago
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization ☆75 · Updated last week
- Repository for "General Flow as Foundation Affordance for Scalable Robot Learning" ☆46 · Updated last month
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos ☆151 · Updated 3 weeks ago
- MOKA: Open-World Robotic Manipulation through Mark-based Visual Prompting (RSS 2024) ☆67 · Updated 7 months ago
- Official repository of Learning to Act from Actionless Videos through Dense Correspondences. ☆196 · Updated 9 months ago
- [ICRA2023] Grounding Language with Visual Affordances over Unstructured Data ☆38 · Updated last year
- RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning ☆23 · Updated 4 months ago
- ☆60 · Updated 5 months ago
- ☆89 · Updated 6 months ago
- ☆47 · Updated last month
- Reimplementation of GR-1, a generalized policy for robotics manipulation. ☆115 · Updated 5 months ago
- AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation ☆61 · Updated 3 weeks ago
- Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation" ☆43 · Updated 9 months ago
- ☆65 · Updated 3 months ago
- [CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model` ☆104 · Updated 4 months ago
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223 ☆113 · Updated 3 weeks ago
- Efficiently apply modification functions to RLDS/TFDS datasets. ☆24 · Updated 8 months ago
- ☆27 · Updated 4 months ago
- Code for the paper "Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation" ☆76 · Updated 6 months ago
- Official implementation of "Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy." ☆49 · Updated this week
- ☆73 · Updated 5 months ago
- A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation ☆162 · Updated this week
- Code for paper "Grounding Video Models to Actions through Goal Conditioned Exploration". ☆41 · Updated last month
- A Vision-Language Model for Spatial Affordance Prediction in Robotics ☆97 · Updated last week