dongyh20 / Octopus
[ECCV2024] 🐙Octopus, an embodied vision-language model trained with RLEF, emerging superior in embodied visual planning and programming.
☆287 Updated 11 months ago
Alternatives and similar repositories for Octopus:
Users who are interested in Octopus are comparing it to the repositories listed below.
- [ICLR 2024] Source codes for the paper "Building Cooperative Embodied Agents Modularly with Large Language Models" ☆257 Updated last month
- Open Platform for Embodied Agents ☆312 Updated 3 months ago
- OpenEQA Embodied Question Answering in the Era of Foundation Models ☆276 Updated 7 months ago
- (ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life ☆347 Updated 5 months ago
- Code for "Learning to Model the World with Language." ICML 2024 Oral. ☆384 Updated last year
- Towards Large Multimodal Models as Visual Foundation Agents ☆209 Updated last week
- [ICML 2024] Official code repository for 3D embodied generalist agent LEO ☆436 Updated 2 weeks ago
- [NeurIPSw'24] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simu… ☆88 Updated 3 months ago
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆128 Updated 6 months ago
- [CVPR2024] This is the official implement of MP5 ☆101 Updated 10 months ago
- ☆128 Updated 9 months ago
- Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model ☆359 Updated 10 months ago
- [arXiv 2023] Embodied Task Planning with Large Language Models ☆185 Updated last year
- Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen… ☆277 Updated last year
- Embodied Agent Interface (EAI): Benchmarking LLMs for Embodied Decision Making (NeurIPS D&B 2024 Oral) ☆192 Updated 2 months ago
- [ICCV'23] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models ☆182 Updated last month
- Official implementation of WebVLN: Vision-and-Language Navigation on Websites ☆28 Updated last year
- ☆101 Updated 3 weeks ago
- [ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation ☆437 Updated 5 months ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆102 Updated last year
- ☆44 Updated last year
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks ☆91 Updated 3 weeks ago
- Compose multimodal datasets ☆360 Updated 2 weeks ago
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning ☆347 Updated 4 months ago
- ☆86 Updated 10 months ago
- Official repo of EmbodiedBench, a comprehensive benchmark designed to evaluate MLLMs as embodied agents. (ICML 2025) ☆114 Updated this week
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos ☆239 Updated 3 months ago
- [COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs ☆142 Updated 8 months ago
- ControlLLM: Augment Language Models with Tools by Searching on Graphs ☆192 Updated 9 months ago
- Implementation of "PaLM-E: An Embodied Multimodal Language Model" ☆300 Updated last year