dongyh20 / Octopus
[ECCV2024] Octopus, an embodied vision-language model trained with RLEF that excels at embodied visual planning and programming.
★292 · Updated last year
Alternatives and similar repositories for Octopus
Users interested in Octopus are comparing it to the repositories listed below.
- (ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life (★362, updated 11 months ago)
- Open Platform for Embodied Agents (★331, updated 9 months ago)
- OpenEQA: Embodied Question Answering in the Era of Foundation Models (★327, updated last year)
- [ICLR 2024] Source code for the paper "Building Cooperative Embodied Agents Modularly with Large Language Models" (★279, updated 7 months ago)
- [IROS'25 Oral & NeurIPSw'24] Official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simula…" (★95, updated 4 months ago)
- [CVPR 2024] The official implementation of MP5 (★105, updated last year)
- Code for "Learning to Model the World with Language" (ICML 2024 Oral) (★397, updated 2 years ago)
- (★45, updated last year)
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks (★176, updated last month)
- [arXiv 2023] Embodied Task Planning with Large Language Models (★192, updated 2 years ago)
- [ICML 2024] Official code repository for LEO, a 3D embodied generalist agent (★464, updated 6 months ago)
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World (★133, updated last year)
- Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model (★369, updated last year)
- (★131, updated last year)
- (★114, updated 6 months ago)
- Official implementation of "JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse" (★102, updated 2 months ago)
- GROOT: Learning to Follow Instructions by Watching Gameplay Videos (ICLR 2024 Spotlight) (★66, updated last year)
- [ICML 2025 Oral] Official repo of EmbodiedBench, a comprehensive benchmark for evaluating MLLMs as embodied agents (★205, updated last week)
- Implementation of "PaLM-E: An Embodied Multimodal Language Model" (★329, updated last year)
- [ICCV'23] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models (★205, updated 7 months ago)
- Official implementation of WebVLN: Vision-and-Language Navigation on Websites (★30, updated last year)
- Towards Large Multimodal Models as Visual Foundation Agents (★241, updated 6 months ago)
- (★95, updated last year)
- GPT-4V in Wonderland: LMMs as Smartphone Agents (★135, updated last year)
- Embodied Agent Interface (EAI): Benchmarking LLMs for Embodied Decision Making (NeurIPS D&B 2024 Oral) (★264, updated 7 months ago)
- [ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation (★460, updated 11 months ago)
- (★32, updated last year)
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in the Perception-Cognition-Action Chain (★103, updated last year)
- Official repo for "Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning" (★394, updated 10 months ago)
- Evaluate Multimodal LLMs as Embodied Agents (★54, updated 8 months ago)