Dantong88 / LLARVA
☆20Updated 3 months ago
Related projects:
- [ICRA2023] Grounding Language with Visual Affordances over Unstructured Data☆34Updated 10 months ago
- ☆63Updated last month
- Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"☆43Updated 5 months ago
- [CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model`☆63Updated this week
- Dreamitate: Real-World Visuomotor Policy Learning via Video Generation☆38Updated 2 months ago
- Theia: Distilling Diverse Vision Foundation Models for Robot Learning☆141Updated 2 weeks ago
- [RSS 2024] Learning Manipulation by Predicting Interaction☆78Updated last month
- [ICLR 2023] SQA3D for embodied scene understanding and reasoning☆115Updated 11 months ago
- [CoRL 2023 Oral] GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields☆111Updated 8 months ago
- ☆41Updated 2 weeks ago
- NeurIPS 2022 Paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation"☆79Updated last year
- ☆34Updated 4 months ago
- ☆39Updated 8 months ago
- [RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre…☆51Updated 2 months ago
- Repo for the paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation`☆43Updated 3 months ago
- [CoRL2023] Official PyTorch implementation of PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation☆29Updated 3 months ago
- ☆58Updated 11 months ago
- [ECCV 2024] 🎉 Official repository of "Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipu…☆34Updated 8 months ago
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World☆115Updated 6 months ago
- ☆32Updated last week
- ☆35Updated 2 weeks ago
- ☆32Updated last month
- Official implementation of "SUGAR: Pre-training 3D Visual Representations for Robotics" (CVPR'24).☆27Updated 2 months ago
- Code for the paper "Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation"☆56Updated last month
- Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"☆95Updated 4 months ago
- ☆19Updated 6 months ago
- Official implementation of Learning from Unlabeled 3D Environments for Vision-and-Language Navigation (ECCV'22).☆32Updated last year
- Main augmentation script for a real-world robot dataset.☆26Updated last year
- OVExp: Open Vocabulary Exploration for Object-Oriented Navigation☆28Updated 2 months ago
- Official implementation of GROOT, CoRL 2023☆45Updated 10 months ago