IranQin / MP5
[CVPR2024] This is the official implementation of MP5
☆84 Updated 4 months ago
Related projects
Alternatives and complementary repositories for MP5
- [NIPS24W] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated… ☆73 Updated 4 months ago
- ⛏💎 STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment ☆30 Updated 10 months ago
- ☆61 Updated last month
- ☆41 Updated 7 months ago
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆122 Updated 3 weeks ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆99 Updated 8 months ago
- The repo for the paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation` ☆61 Updated 5 months ago
- [CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Lan… ☆48 Updated 2 weeks ago
- ☆101 Updated 2 weeks ago
- ☆114 Updated 4 months ago
- ☆25 Updated last month
- Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution" ☆22 Updated last week
- Official implementation of WebVLN: Vision-and-Language Navigation on Websites ☆23 Updated 10 months ago
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models ☆75 Updated 2 months ago
- The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models" ☆163 Updated last month
- ☆40 Updated 11 months ago
- Code&Data for Grounded 3D-LLM with Referent Tokens ☆89 Updated last month
- Official code for the paper: Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld ☆47 Updated last month
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning ☆204 Updated last month
- [NeurIPS-2024] The official implementation of "Instruction-Guided Visual Masking" ☆29 Updated this week
- ☆30 Updated 3 weeks ago
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223 ☆70 Updated 2 weeks ago
- [arXiv 2023] Embodied Task Planning with Large Language Models ☆156 Updated last year
- [CVPR 2024] The code for paper 'Towards Learning a Generalist Model for Embodied Navigation' ☆122 Updated 5 months ago
- Official repository of Learning to Act from Actionless Videos through Dense Correspondences. ☆173 Updated 6 months ago
- This repository compiles a list of papers related to the application of video technology in the field of robotics! Star⭐ the repo and fol… ☆123 Updated 3 months ago
- ☆46 Updated 2 months ago
- MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning ☆96 Updated 6 months ago
- [ICLR 2024] Code for the paper "Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning" ☆129 Updated 3 weeks ago
- A Minecraft multi-agent framework ☆35 Updated this week