dongyh20 / Octopus
Octopus, an embodied vision-language model trained with RLEF, emerging superior in embodied visual planning and programming.
☆283 Updated 8 months ago
Alternatives and similar repositories for Octopus:
Users who are interested in Octopus are comparing it to the libraries listed below.
- [ICLR 2024] Source codes for the paper "Building Cooperative Embodied Agents Modularly with Large Language Models" ☆243 Updated 3 months ago
- Open Platform for Embodied Agents ☆284 Updated last month
- OpenEQA Embodied Question Answering in the Era of Foundation Models ☆254 Updated 4 months ago
- [ICML 2024] Official code repository for 3D embodied generalist agent LEO ☆411 Updated 3 weeks ago
- Embodied Agent Interface (EAI): Benchmarking LLMs for Embodied Decision Making (NeurIPS D&B 2024 Oral) ☆169 Updated last month
- Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model ☆351 Updated 7 months ago
- (ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life ☆333 Updated 2 months ago
- Code for "Learning to Model the World with Language." ICML 2024 Oral. ☆377 Updated last year
- [arXiv 2023] Embodied Task Planning with Large Language Models ☆169 Updated last year
- [ICCV'23] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models ☆162 Updated 8 months ago
- ☆124 Updated 7 months ago
- Implementation of "PaLM-E: An Embodied Multimodal Language Model" ☆285 Updated last year
- [ICLR 2024 Spotlight] Code for the paper "Text2Reward: Reward Shaping with Language Models for Reinforcement Learning" ☆146 Updated last month
- Towards Large Multimodal Models as Visual Foundation Agents ☆179 Updated last week
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning ☆276 Updated 2 months ago
- [NeurIPSw'24] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simu… ☆80 Updated 3 weeks ago
- [CVPR 2024] This is the official implementation of MP5 ☆94 Updated 7 months ago
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆124 Updated 3 months ago
- Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen… ☆270 Updated last year
- ☆44 Updated last year
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆103 Updated 11 months ago
- (VillagerAgent ACL 2024) A Graph based Minecraft multi agents framework ☆41 Updated 3 weeks ago
- ☆83 Updated 8 months ago
- SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. … ☆129 Updated 10 months ago
- Compose multimodal datasets ☆279 Updated last week
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks ☆60 Updated last month
- Official implementation of WebVLN: Vision-and-Language Navigation on Websites ☆28 Updated last year
- Official repo of VLABench, a large scale benchmark designed for fairly evaluating VLA, Embodied Agent, and VLMs. ☆109 Updated last month
- Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi e… ☆391 Updated 2 months ago
- [COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs ☆134 Updated 5 months ago