JiuTian-VL / Optimus-1
[NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks
☆24 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for Optimus-1
- The Official Code Repository for GUI-World. ☆36 Updated 3 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆37 Updated 6 months ago
- Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding ☆37 Updated 3 weeks ago
- Official Repo for UGround ☆93 Updated this week
- ⛏💎 STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment ☆28 Updated 10 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆38 Updated 4 months ago
- A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents ☆30 Updated 3 weeks ago
- Enhancement in Multimodal Representation Learning. ☆38 Updated 7 months ago
- This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Co… ☆71 Updated 4 months ago
- Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs ☆41 Updated 4 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆56 Updated 5 months ago
- The official implementation of the paper "Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction". ☆32 Updated 8 months ago
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024) ☆25 Updated 3 months ago
- Official repository for paper "GTA: A Benchmark for General Tool Agents" (NeurIPS 2024 D&B Track) ☆43 Updated this week
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs ☆62 Updated 2 weeks ago
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM ☆55 Updated 5 months ago
- Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image … ☆51 Updated 3 weeks ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆75 Updated 3 weeks ago
- The Code Repo for Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization ☆93 Updated 2 months ago
- ControlLLM: Augment Language Models with Tools by Searching on Graphs ☆186 Updated 3 months ago
- Code for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models ☆121 Updated last week
- Official implementation of ECCV24 paper: POA ☆24 Updated 3 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ☆94 Updated 2 weeks ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆18 Updated last month
- From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging ☆52 Updated last month