2644521362 / SC-MLLM
☆19Updated 8 months ago
Alternatives and similar repositories for SC-MLLM:
Users who are interested in SC-MLLM are comparing it to the repositories listed below.
- ☆42Updated last month
- [ICRA2023] Grounding Language with Visual Affordances over Unstructured Data☆38Updated last year
- ☆27Updated 4 months ago
- Official implementation of GR-MG☆67Updated 2 weeks ago
- Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"☆43Updated 9 months ago
- ☆47Updated last month
- Public release for "Explore until Confident: Efficient Exploration for Embodied Question Answering"☆41Updated 6 months ago
- ☆37Updated 3 months ago
- The official codebase for ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation (CVPR 2024)☆105Updated 6 months ago
- [CoRL 2023] XSkill: cross embodiment skill discovery☆56Updated 10 months ago
- Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"☆57Updated 2 months ago
- [ICCV'23] Learning Vision-and-Language Navigation from YouTube Videos☆48Updated last month
- Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation☆51Updated 2 weeks ago
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization☆69Updated last month
- ☆22Updated 7 months ago
- MOKA: Open-World Robotic Manipulation through Mark-based Visual Prompting (RSS 2024)☆67Updated 6 months ago
- [CVPR'2024] "SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution"☆57Updated 4 months ago
- ☆34Updated last year
- [IROS24 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models☆82Updated 5 months ago
- Official implementation of NeurIPS 2023 paper "FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation"☆30Updated last year
- ☆43Updated 9 months ago
- ☆29Updated last year
- [RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre…☆89Updated 3 months ago
- NeurIPS 2022 Paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation"☆83Updated last year
- [CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model`☆102Updated 3 months ago
- ☆62Updated 3 weeks ago
- Code for ICRA24 paper "Think, Act, and Ask: Open-World Interactive Personalized Robot Navigation". Paper: https://arxiv.org/abs/2310.07968 …☆22Updated 7 months ago
- ☆31Updated 2 months ago
- ☆55Updated 4 months ago