Zhoues / MineDreamer
[NeurIPSw'24] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control"
☆77Updated this week
Alternatives and similar repositories for MineDreamer:
Users who are interested in MineDreamer are comparing it to the repositories listed below.
- [CVPR2024] This is the official implementation of MP5☆93Updated 7 months ago
- [ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment☆35Updated last year
- ☆65Updated last month
- ☆44Updated last year
- ☆123Updated 6 months ago
- [NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding☆62Updated 3 weeks ago
- (VillagerAgent ACL 2024) A graph-based Minecraft multi-agent framework☆41Updated this week
- HAZARD challenge☆27Updated 8 months ago
- ☆34Updated 3 weeks ago
- GROOT: Learning to Follow Instructions by Watching Gameplay Videos☆60Updated last year
- [NeurIPS-2024] The official implementation of "Instruction-Guided Visual Masking"☆32Updated 2 months ago
- Official repository of S-Agents: Self-organizing Agents in Open-ended Environment☆21Updated 10 months ago
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks☆55Updated 2 weeks ago
- LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents (ICLR 2024)☆65Updated 5 months ago
- Official implementation of "Self-Improving Video Generation"☆58Updated last month
- Official repo of VLABench, a large scale benchmark designed for fairly evaluating VLA, Embodied Agent, and VLMs.☆91Updated 2 weeks ago
- Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"☆57Updated 2 months ago
- ☆26Updated last week
- [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"☆104Updated last year
- Official Implementation of ReALFRED (ECCV'24)☆32Updated 3 months ago
- Latent Motion Token as the Bridging Language for Robot Manipulation☆67Updated last month
- ☆32Updated last month
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision☆58Updated 6 months ago
- ☆43Updated 9 months ago
- Official code for the paper: Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld☆52Updated 3 months ago
- A paper collection covering the ongoing line of work that started from World Models.☆164Updated 6 months ago
- SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enh…☆29Updated 5 months ago
- ☆35Updated last year
- Codebase for HiP☆88Updated last year
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World☆124Updated 3 months ago