XiaomiMiMo / MiMo-Embodied
☆342 · Updated last month
Alternatives and similar repositories for MiMo-Embodied
Users interested in MiMo-Embodied are comparing it to the repositories listed below.
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks · ☆185 · Updated 3 months ago
- RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation · ☆275 · Updated last month
- Unified Vision-Language-Action Model · ☆257 · Updated 2 months ago
- LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [Actively Maintained🔥] · ☆173 · Updated 2 months ago
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934 · ☆181 · Updated 2 months ago
- NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks · ☆200 · Updated last month
- ☆346 · Updated last week
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models) · ☆122 · Updated last year
- RynnVLA-002: A Unified Vision-Language-Action and World Model · ☆818 · Updated last month
- ☆60 · Updated last month
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models · ☆164 · Updated 3 months ago
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy · ☆330 · Updated last week
- F1: A Vision Language Action Model Bridging Understanding and Generation to Actions · ☆153 · Updated last week
- Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces · ☆87 · Updated 7 months ago
- ☆102 · Updated 2 months ago
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics" · ☆218 · Updated 3 weeks ago
- The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models." · ☆326 · Updated 3 months ago
- VLA-RFT: Vision-Language-Action Models with Reinforcement Fine-Tuning · ☆113 · Updated 3 months ago
- ☆60 · Updated 9 months ago
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge · ☆273 · Updated this week
- Official code for EWMBench: Evaluating Scene, Motion, and Semantic Quality in Embodied World Models · ☆95 · Updated 6 months ago
- EO: Open-source Unified Embodied Foundation Model Series · ☆34 · Updated 2 weeks ago
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, … · ☆198 · Updated 8 months ago
- Virtual Community: An Open World for Humans, Robots, and Society · ☆178 · Updated last week
- Official implementation for BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation · ☆100 · Updated 5 months ago
- [ICML'25] The PyTorch implementation of the paper "AdaWorld: Learning Adaptable World Models with Latent Actions" · ☆190 · Updated 6 months ago
- Galaxea's first VLA release · ☆471 · Updated this week
- InternVLA-A1: Unifying Understanding, Generation, and Action for Robotic Manipulation · ☆226 · Updated this week
- Nav-R1: Reasoning and Navigation in Embodied Scenes · ☆96 · Updated 2 months ago
- Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications. · ☆274 · Updated this week