[ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment
☆41Dec 27, 2023Updated 2 years ago
Alternatives and similar repositories for STEVE
Users that are interested in STEVE are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ICCV 2023] Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation☆24Aug 26, 2023Updated 2 years ago
- [AAAI 2024] UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning☆12Dec 10, 2023Updated 2 years ago
- [CVPR2024] This is the official implement of MP5☆107Jun 30, 2024Updated last year
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt…☆30Oct 21, 2024Updated last year
- A Massive Multi-Discipline Lecture Understanding Benchmark☆34Apr 20, 2026Updated last month
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- We introduce ADAM, An emboDied causal Agent in Minecraft, that can autonomously navigate the open world, perceive multimodal contexts, le…☆31Apr 7, 2025Updated last year
- [IROS'25 Oral & NeurIPSw'24] Official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simula…☆103Jun 16, 2025Updated last year
- The official code of "PixelWorld: Towards Perceiving Everything as Pixels" [TMLR25]☆16Sep 12, 2025Updated 9 months ago
- [ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark☆144Jun 4, 2025Updated last year
- An experiment with movie scenes and contrastive learning☆11Feb 1, 2025Updated last year
- VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation☆20Jun 2, 2025Updated last year
- ☆41Jan 12, 2026Updated 5 months ago
- [CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding☆703Jan 29, 2025Updated last year
- [ACM MM 2023] PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation☆12Aug 28, 2023Updated 2 years ago
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- Official code for CVPR2024 “VideoMAC: Video Masked Autoencoders Meet ConvNets”☆15May 12, 2026Updated last month
- An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models☆18Feb 27, 2025Updated last year
- 【CVPRW'23】First Place Solution to the CVPR'2023 AQTC Challenge☆15Jul 18, 2023Updated 2 years ago
- Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer☆16Nov 21, 2024Updated last year
- Foundation Model for MineDojo☆298Apr 2, 2023Updated 3 years ago
- Odyssey: Empowering Minecraft Agents with Open-World Skills☆391Oct 22, 2025Updated 7 months ago
- [ECCV 2022] AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant☆23Jan 30, 2026Updated 4 months ago
- [IJCAI 2024] Official implementation of the paper "Integrating View Conditions for Image Synthesis"☆25Aug 27, 2024Updated last year
- FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024☆22Dec 9, 2024Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- ☆12Sep 11, 2023Updated 2 years ago
- ☆22Nov 9, 2023Updated 2 years ago
- Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memo…☆640Jun 5, 2023Updated 3 years ago
- Official code for the ICCV2023 paper ``One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training''☆20Aug 9, 2023Updated 2 years ago
- [NeurIPS DB 2025] IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering☆46Oct 15, 2025Updated 8 months ago
- A Novel Dataset and Baseline Method for Cross-View Multi-Object Tracking in DIVerse Open Scenes (IJCV 2024)☆100Nov 13, 2025Updated 7 months ago