JiuTian-VL / Optimus-1View external linksLinks
[NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks
☆94Jun 17, 2025Updated 7 months ago
Alternatives and similar repositories for Optimus-1
Users that are interested in Optimus-1 are comparing it to the libraries listed below
Sorting:
- Paper List of Minecraft Agents☆55Aug 15, 2025Updated 6 months ago
- [CVPR 2025] Official Implementation for Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy☆22Jun 17, 2025Updated 7 months ago
- Official implementation of paper "ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting" (CVPR'25)☆46Apr 13, 2025Updated 10 months ago
- Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents☆56Dec 3, 2025Updated 2 months ago
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"☆18Oct 1, 2024Updated last year
- ☆18Jun 10, 2025Updated 8 months ago
- Official repo for StyleMe3D☆28Apr 22, 2025Updated 9 months ago
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models☆21Jan 29, 2025Updated last year
- We introduce ADAM, An emboDied causal Agent in Minecraft, that can autonomously navigate the open world, perceive multimodal contexts, le…☆26Apr 7, 2025Updated 10 months ago
- ☆67Mar 6, 2025Updated 11 months ago
- This is the official project of paper: Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conver…☆22Nov 18, 2024Updated last year
- [ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment☆41Dec 27, 2023Updated 2 years ago
- Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"☆28Jul 31, 2024Updated last year
- ☆14Jan 24, 2025Updated last year
- HiconAgent: History Context-aware Policy Optimization for GUI Agents☆22Updated this week
- [ICCV 2025 Highlight] Less is More: Empowering GUI Agent with Context-Aware Simplification☆41Jan 21, 2026Updated 3 weeks ago
- Code and Data for the paper "Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works".☆21Jul 24, 2024Updated last year
- ☆46Jun 11, 2025Updated 8 months ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".☆27Aug 20, 2025Updated 5 months ago
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"☆49Jan 30, 2026Updated 2 weeks ago
- [SIGIR 2025] This is the code repo for our SIGIR'25 paper: Enhancing the Patent Matching Capability of Large Language Models via Memory G…☆18Apr 22, 2025Updated 9 months ago
- Several agents that can play poker (using probability, monte carlo, etc.) and clustering to get the types of poker players.☆13Updated this week
- MAKGED is the first multi-agent framework for collaborative error detection in knowledge graphs.☆30Jul 20, 2025Updated 6 months ago
- This repo is a live list of papers on game playing and large multimodality model - "A Survey on Game Playing Agents and Large Models: Met…☆162Sep 3, 2024Updated last year
- [EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information☆12Oct 11, 2024Updated last year
- ☆15Apr 11, 2024Updated last year
- CoV: Chain-of-View Prompting for Spatial Reasoning☆50Jan 23, 2026Updated 3 weeks ago
- ☆35Jan 21, 2025Updated last year
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding☆66Jun 10, 2025Updated 8 months ago
- ☆20Aug 13, 2024Updated last year
- The official implementation of the paper "Affective Faces for Goal-Driven Dyadic Communication."☆15Jan 27, 2023Updated 3 years ago
- [ACL 2025] Can MLLMs Understand the Deep Implication Behind Chinese Images?☆20Oct 20, 2025Updated 3 months ago
- Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes☆21Jun 15, 2025Updated 8 months ago
- Repo for Paper "OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft"☆24Feb 5, 2026Updated last week
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders☆18May 23, 2025Updated 8 months ago
- Official repository of "Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning", ICCV 2021☆17Aug 4, 2023Updated 2 years ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding"☆22Dec 8, 2024Updated last year
- [ICCV 2025] GUIOdyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUIOdyssey consists of 8,834 e…☆147Jan 3, 2026Updated last month
- Official Implementation of "JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse"☆128Aug 27, 2025Updated 5 months ago