JARVIS-1: Open-world Multi-task Agents with Memory-Augmented Multimodal Language Models
☆390Apr 8, 2024Updated last year
Alternatives and similar repositories for JARVIS-1
Users that are interested in JARVIS-1 are comparing it to the libraries listed below.
- GROOT: Learning to Follow Instructions by Watching Gameplay Videos (ICLR'24, Spotlight)☆67Dec 18, 2023Updated 2 years ago
- Implementation of "Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction"☆46Aug 15, 2023Updated 2 years ago
- Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen…☆293Aug 3, 2023Updated 2 years ago
- STEVE-1: A Generative Model for Text-to-Behavior in Minecraft☆204Jun 4, 2024Updated last year
- [IROS'25 Oral & NeurIPSw'24] Official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simula…☆104Jun 16, 2025Updated 9 months ago
- ☆31Jun 25, 2024Updated last year
- [CVPR2024] This is the official implementation of MP5☆108Jun 30, 2024Updated last year
- Official Implementation of Paper "ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment" (AAAI'26)☆41Jul 2, 2025Updated 8 months ago
- Paper List of Minecraft Agents☆58Mar 6, 2026Updated 2 weeks ago
- ☁️ KUMO: Generative Evaluation of Complex Reasoning in Large Language Models☆19Jun 4, 2025Updated 9 months ago
- MineStudio: A Streamlined Package for Minecraft AI Agent Development☆354Feb 7, 2026Updated last month
- Official implementation of paper "ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting" (CVPR'25)☆46Apr 13, 2025Updated 11 months ago
- The official implementation of the paper "Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction".☆35Feb 10, 2024Updated 2 years ago
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks☆96Jun 17, 2025Updated 9 months ago
- ☆37Oct 21, 2025Updated 5 months ago
- An Open-Ended Embodied Agent with Large Language Models☆6,760Apr 3, 2024Updated last year
- Foundation Model for MineDojo☆297Apr 2, 2023Updated 2 years ago
- ☆47Dec 11, 2023Updated 2 years ago
- Official Implementation of "JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse"☆141Aug 27, 2025Updated 6 months ago
- We introduce ADAM, An emboDied causal Agent in Minecraft, that can autonomously navigate the open world, perceive multimodal contexts, le…☆27Apr 7, 2025Updated 11 months ago
- ☆88Dec 15, 2023Updated 2 years ago
- ☆99Jun 12, 2024Updated last year
- LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills☆764Feb 1, 2024Updated 2 years ago
- Building Open-Ended Embodied Agents with Internet-Scale Knowledge☆2,166Mar 18, 2024Updated 2 years ago
- Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memo…☆638Jun 5, 2023Updated 2 years ago
- Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos☆1,663Sep 3, 2025Updated 6 months ago
- Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs☆111Sep 30, 2025Updated 5 months ago
- Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning☆11Jul 20, 2022Updated 3 years ago
- GPT-4V in Wonderland: LMMs as Smartphone Agents☆134Jul 17, 2024Updated last year
- Official implementation of the DECKARD Agent from the paper "Do Embodied Agents Dream of Pixelated Sheep?"☆94May 23, 2023Updated 2 years ago
- Me☆30Feb 11, 2023Updated 3 years ago
- [ECCV2024] 🐙Octopus, an embodied vision-language model trained with RLEF, emerging superior in embodied visual planning and programming.☆297May 20, 2024Updated last year
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.☆1,759Sep 9, 2024Updated last year
- [NeurIPS '23 Spotlight] Thought Cloning: Learning to Think while Acting by Imitating Human Thinking☆268Jun 28, 2024Updated last year
- Repo for Paper "OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft"☆28Feb 5, 2026Updated last month
- Code for "Learning to Model the World with Language." ICML 2024 Oral.☆414Jan 7, 2026Updated 2 months ago
- BASALT Benchmark datasets, evaluation code and agent training example.☆22Nov 29, 2023Updated 2 years ago
- [ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment☆41Dec 27, 2023Updated 2 years ago
- [ICCV'23] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models☆218Mar 26, 2025Updated 11 months ago