dongyh20 / Octopus
Octopus, an embodied vision-language model trained with RLEF, emerging superior in embodied visual planning and programming.
☆281 Updated 8 months ago
Alternatives and similar repositories for Octopus:
Users who are interested in Octopus are comparing it to the libraries listed below:
- [ICLR 2024] Source codes for the paper "Building Cooperative Embodied Agents Modularly with Large Language Models" ☆242 Updated 3 months ago
- OpenEQA Embodied Question Answering in the Era of Foundation Models ☆250 Updated 4 months ago
- Code for "Learning to Model the World with Language." ICML 2024 Oral. ☆376 Updated last year
- Open Platform for Embodied Agents ☆282 Updated 2 weeks ago
- (ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life ☆322 Updated last month
- [arXiv 2023] Embodied Task Planning with Large Language Models ☆169 Updated last year
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning ☆252 Updated last month
- Embodied Agent Interface (EAI): Benchmarking LLMs for Embodied Decision Making (NeurIPS D&B 2024 Oral) ☆157 Updated 2 weeks ago
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆124 Updated 3 months ago
- Towards Large Multimodal Models as Visual Foundation Agents ☆167 Updated last month
- [ICML 2024] Official code repository for 3D embodied generalist agent LEO ☆402 Updated last week
- Implementation of "PaLM-E: An Embodied Multimodal Language Model" ☆284 Updated last year
- ☆123 Updated 6 months ago
- Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model ☆350 Updated 7 months ago
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents ☆306 Updated 9 months ago
- [ICCV'23] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models ☆161 Updated 7 months ago
- [ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation ☆410 Updated last month
- ☆44 Updated last year
- Official repo and evaluation implementation of VSI-Bench ☆353 Updated this week
- ☆83 Updated 7 months ago
- [NeurIPSw'24] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simu… ☆77 Updated this week
- This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024) ☆163 Updated last month
- Compose multimodal datasets ☆269 Updated last month
- [ICLR 2024 Spotlight] Code for the paper "Text2Reward: Reward Shaping with Language Models for Reinforcement Learning" ☆141 Updated last month
- Long Context Transfer from Language to Vision ☆359 Updated 2 months ago
- [ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model ☆409 Updated 3 months ago
- Embodied Chain of Thought: A robotic policy that reasons to solve the task. ☆124 Updated 5 months ago
- (VillagerAgent ACL 2024) A Graph based Minecraft multi agents framework ☆41 Updated this week
- Code for RoboFlamingo ☆335 Updated 8 months ago
- [CVPR2024] This is the official implementation of MP5 ☆93 Updated 6 months ago