[CVPR 2025] Official Implementation for Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy
☆25Jun 17, 2025Updated 10 months ago
Alternatives and similar repositories for CVPR25-Optimus-2
Users that are interested in CVPR25-Optimus-2 are comparing it to the libraries listed below.
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks☆100Jun 17, 2025Updated 10 months ago
- Paper List of Minecraft Agents☆62Mar 6, 2026Updated last month
- Constructing community of LLM-based Agent in the minecraft☆17Nov 27, 2025Updated 5 months ago
- Official repository of the "Fine-grained Key-Value Memory Enhanced Predictor for Video Representation Learning" (ACM MM 2023)☆23Jul 11, 2024Updated last year
- We introduce ADAM, An emboDied causal Agent in Minecraft, that can autonomously navigate the open world, perceive multimodal contexts, le…☆28Apr 7, 2025Updated last year
- [CVPR 2022 Oral] Faithful Extreme Rescaling via Generative Prior Reciprocated Invertible Representations☆13Jul 14, 2022Updated 3 years ago
- ☆17Sep 23, 2023Updated 2 years ago
- Code of the paper "Correctable Landmark Discovery via Large Models for Vision-Language Navigation" (TPAMI 2024)☆16Jun 7, 2024Updated last year
- Code of paper "HyperVLA: Efficient Inference in Vision-Language-Action Models via Hypernetworks"☆24Oct 8, 2025Updated 6 months ago
- GROOT: Learning to Follow Instructions by Watching Gameplay Videos (ICLR'24, Spotlight)☆67Dec 18, 2023Updated 2 years ago
- Detection and Reconstruction of Transparent Objects with Infrared Projection-based RGB-D Cameras☆13Jan 17, 2021Updated 5 years ago
- [ICCV 23] This is a PyTorch implementation of our paper "SMMix: Self-Motivated Image Mixing for Vision Transformers"☆16Jul 14, 2023Updated 2 years ago
- This project is a sample program of FBX SDK Python Bindings.☆17Nov 29, 2019Updated 6 years ago
- Synthetic Hypertext and Homomorphic Catalogue☆16Dec 28, 2024Updated last year
- [ICCV 2025 Highlight] Less is More: Empowering GUI Agent with Context-Aware Simplification☆47Mar 12, 2026Updated last month
- CVPR 2026 - MSGNav: Unleashing the Power of Multi-modal 3D Scene Graph for Zero-Shot Embodied Navigation☆46Mar 23, 2026Updated last month
- CVPR 2024 "Instance Tracking in 3D Scenes from Egocentric Videos"☆19Jun 27, 2024Updated last year
- Official repository of "SFTrack: A Robust Scale and Motion Adaptive Algorithm for Tracking Small and Fast Moving Objects" (IROS 2024)☆17Mar 9, 2025Updated last year
- detecting tennis court keypoints with yolo☆10Apr 19, 2026Updated last week
- ☆10May 5, 2024Updated last year
- Lossy Compression with Pretrained Diffusion Models☆35Dec 10, 2025Updated 4 months ago
- ☆11Jul 4, 2024Updated last year
- [ECCV] HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning☆26Sep 6, 2025Updated 7 months ago
- ☆13Apr 28, 2019Updated 7 years ago
- Repo for Paper "OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft"☆31Apr 2, 2026Updated 3 weeks ago
- ☆16Apr 14, 2026Updated 2 weeks ago
- [CVPRW 2025] Official repository of DTTDNet: Robust Digital-Twin Localization via An RGBD-based Transformer Network and A Comprehensive E…☆23Apr 9, 2026Updated 2 weeks ago
- ☆20Apr 14, 2023Updated 3 years ago
- Code for the C2KD paper (ICASSP 2023)☆19May 15, 2023Updated 2 years ago
- [CVPRW 2023] Official repository of "Digital Twin Tracking Dataset (DTTD): A New RGB+Depth 3D Dataset for Longer-Range Object Tracking Ap…☆24Nov 22, 2024Updated last year
- ☆28Feb 29, 2024Updated 2 years ago
- Implementation of "Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction"☆46Aug 15, 2023Updated 2 years ago
- [ICASSP'25] Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues☆17Dec 31, 2024Updated last year
- KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation☆22Apr 23, 2025Updated last year
- This is the official implementation of the EMNLP Findings paper, VideoINSTA: Zero-shot Long-Form Video Understanding via Informative Spatia…☆25Apr 7, 2026Updated 3 weeks ago
- Text world based on Minecraft rules.☆17May 13, 2024Updated last year
- [ICCV 2025] MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation☆51Oct 14, 2025Updated 6 months ago
- Build your rail application environment in a handy way☆12Dec 4, 2018Updated 7 years ago
- MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence☆57Mar 11, 2026Updated last month