JiuTian-VL/Optimus-1

[NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks

☆96

Alternatives and similar repositories for Optimus-1

Users who are interested in Optimus-1 are comparing it to the libraries listed below.

lizaijing / Awesome-Minecraft-Agent
Paper List of Minecraft Agents
☆56 · Updated this week
JiuTian-VL / Optimus-2
[CVPR 2025] Official Implementation for Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy
☆23 · Jun 17, 2025 · Updated 8 months ago
CraftJarvis / ROCKET-1
Official implementation of paper "ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting" (CVPR'25)
☆46 · Apr 13, 2025 · Updated 10 months ago
elated-sawyer / WALL-E
Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
☆57 · Dec 3, 2025 · Updated 3 months ago
WangWenhao0716 / PDF-Embedding
[NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"
☆18 · Oct 1, 2024 · Updated last year
marinero4972 / CyberV
☆18 · Jun 10, 2025 · Updated 8 months ago
AIGCResearch / styleme3d
Official repo for StyleMe3D
☆28 · Apr 22, 2025 · Updated 10 months ago
hulianyuyy / iLLaVA
iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models
☆21 · Jan 29, 2025 · Updated last year
OpenCausaLab / ADAM
We introduce ADAM, An emboDied causal Agent in Minecraft, that can autonomously navigate the open world, perceive multimodal contexts, le…
☆27 · Apr 7, 2025 · Updated 11 months ago
amazon-science / PAE
☆68 · Mar 6, 2025 · Updated last year
nuochenpku / COMEDY
This is the official project of paper: Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conver…
☆22 · Nov 18, 2024 · Updated last year
rese1f / STEVE
[ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment
☆41 · Dec 27, 2023 · Updated 2 years ago
eric-ai-lab / Screen-Point-and-Read
Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"
☆29 · Jul 31, 2024 · Updated last year
JiuTian-VL / HiconAgent
[CVPR 2026] HiconAgent: History Context-aware Policy Optimization for GUI Agents
☆25 · Mar 2, 2026 · Updated last week
longrongyang / STGC
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
☆12 · Feb 11, 2025 · Updated last year
David-Li0406 / SMoA
☆14 · Jan 24, 2025 · Updated last year
JiuTian-VL / SimpAgent
[ICCV 2025 Highlight] Less is More: Empowering GUI Agent with Context-Aware Simplification
☆42 · Jan 21, 2026 · Updated last month
Joanna0123 / character_profiling
Code and Data for the paper "Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works".
☆21 · Jul 24, 2024 · Updated last year
thu-coai / SPaR
☆46 · Jun 11, 2025 · Updated 8 months ago
tsinghua-fib-lab / SmartAgent
The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".
☆27 · Aug 20, 2025 · Updated 6 months ago
huaixuheqing / VPPO-RL
[ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"
☆49 · Jan 30, 2026 · Updated last month
NEUIR / MemGraph
[SIGIR 2025] This is the code repo for our SIGIR'25 paper: Enhancing the Patent Matching Capability of Large Language Models via Memory G…
☆19 · Apr 22, 2025 · Updated 10 months ago
LucasColas / Poker-AI
Several agents that can play poker (using probability, monte carlo, etc.) and clustering to get the types of poker players.
☆13 · Feb 11, 2026 · Updated 3 weeks ago
Hyu-Zhang / BiHGH
[ACMMM 2022 Oral] Official Implementation for Bi-directional Heterogeneous Graph Hashing towards Efficient Outfit Recommendation
☆11 · Dec 12, 2022 · Updated 3 years ago
ElevenLiy / MAKGED
MAKGED is the first multi-agent framework for collaborative error detection in knowledge graphs.
☆30 · Jul 20, 2025 · Updated 7 months ago
bingreeky / MemEvolve
MemEvolve & EvolveLab
☆182 · Dec 23, 2025 · Updated 2 months ago
xiaojieli0903 / MaskAgain
Official repository of the “Mask Again: Masked Knowledge Distillation for Masked Video Modeling” (ACM MM 2023)
☆27 · Jul 11, 2024 · Updated last year
BAAI-Agents / GPA-LM
This repo is a live list of papers on game playing and large multimodality model - "A Survey on Game Playing Agents and Large Models: Met…
☆162 · Sep 3, 2024 · Updated last year
Adaxry / Unified_Layer_Skipping
☆15 · Apr 11, 2024 · Updated last year
ziplab / CoV
CoV: Chain-of-View Prompting for Spatial Reasoning
☆51 · Jan 23, 2026 · Updated last month
GasolSun36 / SURf
[EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information
☆12 · Oct 11, 2024 · Updated last year
MetabrainAGI / Awaker2.5-VL
☆35 · Jan 21, 2025 · Updated last year
thunlp / DeepPerception
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
☆66 · Jun 10, 2025 · Updated 8 months ago
hanbyel0105 / CamDistHumanPose3D
Official repository of "Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning", ICCV 2021
☆17 · Aug 4, 2023 · Updated 2 years ago
webis-de / set-encoder
Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders
☆18 · May 23, 2025 · Updated 9 months ago
bryanchrist / MathNeuro
Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes
☆21 · Jun 15, 2025 · Updated 8 months ago
CraftJarvis / OpenHA
Repo for Paper "OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft"
☆24 · Feb 5, 2026 · Updated last month
scottgeng00 / realtalk
The official implementation of the paper "Affective Faces for Goal-Driven Dyadic Communication."
☆15 · Jan 27, 2023 · Updated 3 years ago
MING-ZCH / CII-Bench
[ACL 2025] Can MLLMs Understand the Deep Implication Behind Chinese Images?
☆20 · Oct 20, 2025 · Updated 4 months ago