dongyh20 / Octopus
[ECCV2024] Octopus, an embodied vision-language model trained with RLEF that excels at embodied visual planning and programming.
☆292 · Updated last year
Alternatives and similar repositories for Octopus
Users who are interested in Octopus are comparing it to the repositories listed below.
- OpenEQA Embodied Question Answering in the Era of Foundation Models ☆306 · Updated 10 months ago
- Open Platform for Embodied Agents ☆326 · Updated 6 months ago
- (ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life ☆359 · Updated 8 months ago
- [ICLR 2024] Source codes for the paper "Building Cooperative Embodied Agents Modularly with Large Language Models" ☆265 · Updated 4 months ago
- [IROS'25 Oral & NeurIPSw'24] Official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simula… ☆91 · Updated last month
- [CVPR2024] This is the official implementation of MP5 ☆103 · Updated last year
- ☆44 · Updated last year
- Code for "Learning to Model the World with Language." ICML 2024 Oral. ☆387 · Updated last year
- ☆109 · Updated 4 months ago
- [ICML 2024] Official code repository for 3D embodied generalist agent LEO ☆451 · Updated 3 months ago
- ☆131 · Updated last year
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks ☆78 · Updated last month
- Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model ☆367 · Updated last year
- [arXiv 2023] Embodied Task Planning with Large Language Models ☆188 · Updated last year
- (VillagerAgent ACL 2024) A Graph based Minecraft multi agents framework ☆68 · Updated last month
- [ICML 2025 Oral] Official repo of EmbodiedBench, a comprehensive benchmark designed to evaluate MLLMs as embodied agents. ☆172 · Updated 3 weeks ago
- ☆92 · Updated last year
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆130 · Updated 9 months ago
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks ☆158 · Updated 2 months ago
- [ICCV'23] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models ☆196 · Updated 4 months ago
- Pandora: Towards General World Model with Natural Language Actions and Video States ☆510 · Updated 10 months ago
- Official implementation of WebVLN: Vision-and-Language Navigation on Websites ☆28 · Updated last year
- Official Implementation of "JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse" ☆89 · Updated 2 months ago
- Implementation of "PaLM-E: An Embodied Multimodal Language Model" ☆316 · Updated last year
- Embodied Agent Interface (EAI): Benchmarking LLMs for Embodied Decision Making (NeurIPS D&B 2024 Oral) ☆225 · Updated 5 months ago
- Towards Large Multimodal Models as Visual Foundation Agents ☆225 · Updated 3 months ago
- GROOT: Learning to Follow Instructions by Watching Gameplay Videos (ICLR 2024 Spotlight) ☆66 · Updated last year
- Virtual Community: An Open World for Humans, Robots, and Society ☆150 · Updated 2 weeks ago
- Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen… ☆282 · Updated 2 years ago
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning ☆378 · Updated 7 months ago