[CVPR2024] This is the official implementation of MP5
☆108 Jun 30, 2024 Updated last year
Alternatives and similar repositories for MP5
Users that are interested in MP5 are comparing it to the libraries listed below
- ☆47 Dec 11, 2023 Updated 2 years ago
- [ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment ☆41 Dec 27, 2023 Updated 2 years ago
- [World-Model-Survey-2024] Paper list and projects for World Model ☆15 Oct 31, 2024 Updated last year
- ☆15 Jun 6, 2024 Updated last year
- JARVIS-1: Open-world Multi-task Agents with Memory-Augmented Multimodal Language Models ☆389 Apr 8, 2024 Updated last year
- Text world based on Minecraft rules. ☆17 May 13, 2024 Updated last year
- GROOT: Learning to Follow Instructions by Watching Gameplay Videos (ICLR'24, Spotlight) ☆67 Dec 18, 2023 Updated 2 years ago
- STEVE-1: A Generative Model for Text-to-Behavior in Minecraft ☆204 Jun 4, 2024 Updated last year
- [ICLR2024] (EvALign-ICL Benchmark) Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context … ☆22 Mar 1, 2024 Updated 2 years ago
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents ☆318 Apr 16, 2024 Updated last year
- ☆11 Oct 25, 2024 Updated last year
- Official implementation of the DECKARD Agent from the paper "Do Embodied Agents Dream of Pixelated Sheep?" ☆94 May 23, 2023 Updated 2 years ago
- ☆12 Nov 5, 2024 Updated last year
- Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA. ☆26 Aug 2, 2024 Updated last year
- ☆30 May 22, 2024 Updated last year
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆54 Jul 21, 2025 Updated 7 months ago
- Responsible Robotic Manipulation ☆16 Aug 31, 2025 Updated 6 months ago
- [ICCV 2025] RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints ☆110 Sep 2, 2025 Updated 6 months ago
- 【ACL 2024】 SALAD benchmark & MD-Judge ☆171 Mar 8, 2025 Updated 11 months ago
- This repo is a live list of papers on game playing and large multimodality model - "A Survey on Game Playing Agents and Large Models: Met… ☆162 Sep 3, 2024 Updated last year
- Diagnostic Framework for LLMs and MLLMs ☆31 Feb 6, 2026 Updated 3 weeks ago
- [EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information ☆12 Oct 11, 2024 Updated last year
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct" ☆20 May 15, 2025 Updated 9 months ago
- Odyssey: Empowering Minecraft Agents with Open-World Skills ☆365 Oct 22, 2025 Updated 4 months ago
- Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs ☆106 Sep 30, 2025 Updated 5 months ago
- Official implementation of "RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics" ☆63 Jan 19, 2026 Updated last month
- Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen… ☆290 Aug 3, 2023 Updated 2 years ago
- Soulstyler: Using Large Language Model to Guide Image Style Transfer for Target Object ☆18 Dec 1, 2024 Updated last year
- This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥 ☆1,725 Feb 12, 2026 Updated 3 weeks ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆69 May 31, 2024 Updated last year
- Moatless Testbeds allows you to create isolated testbed environments in a Kubernetes cluster where you can apply code changes through git… ☆14 Apr 9, 2025 Updated 10 months ago
- [TIP25] Code for "Text-Video Retrieval with Global-Local Semantic Consistent Learning" ☆14 May 12, 2025 Updated 9 months ago
- [NeurIPS 2024] PyTorch code for the paper "Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning… ☆23 Oct 24, 2025 Updated 4 months ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆106 Mar 14, 2024 Updated last year
- HAZARD challenge ☆37 Apr 27, 2025 Updated 10 months ago
- Code implementation for paper "Can Large Language Models Empower Molecular Property Prediction?" ☆39 Jul 14, 2023 Updated 2 years ago
- Official implementation of paper "ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting" (CVPR'25) ☆46 Apr 13, 2025 Updated 10 months ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆81 May 7, 2024 Updated last year
- ☆19 Aug 21, 2024 Updated last year