OpenMOSS / RoboOmni

Official code of "RoboOmni: Proactive Robot Manipulation in Omni-modal Context"

☆61

Alternatives and similar repositories for RoboOmni

Users who are interested in RoboOmni are comparing it to the repositories listed below.

FlagOpen / ShareRobot
☆59 Updated 7 months ago
FlagOpen / RoboBrain-X0
☆88 Updated 3 weeks ago
OpenGVLab / VeBrain
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
☆85 Updated 5 months ago
thunlp / EmbodiedEval
Evaluate Multimodal LLMs as Embodied Agents
☆54 Updated 9 months ago
zwq2018 / embodied_reasoner
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks
☆179 Updated last month
declare-lab / Emma-X
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning
☆76 Updated 6 months ago
sjh0354 / World-Aware-Planning
☆18 Updated 3 months ago
declare-lab / nora
NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks
☆185 Updated 3 months ago
cambridgeltl / topviewrs
TopViewRS: Vision-Language Models as Top-View Spatial Reasoners (EMNLP 2024 Oral)
☆15 Updated 5 months ago
thuml / RLVR-World
Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934
☆135 Updated 3 weeks ago
pkunlp-icler / PCA-EVAL
[ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
☆103 Updated last year
EmbodiedGPT / EgoCOT_Dataset
☆54 Updated last year
hume-vla / hume
🦾 A Dual-System VLA with System2 Thinking
☆115 Updated 2 months ago
MARS-EAI / VIKI-R
[NeurIPS 2025] VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
☆56 Updated 3 weeks ago
Gabesarch / ICAL
☆52 Updated 6 months ago
ustcwhy / BitVLA
Official implementation for BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation
☆90 Updated 4 months ago
XinrunXu / DeepPHY
DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning
☆162 Updated this week
Gabesarch / HELPER
☆32 Updated last year
IranQin / MP5
[CVPR 2024] This is the official implementation of MP5
☆106 Updated last year
OpenHelix-Team / CEED-VLA
Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.
☆43 Updated 2 months ago
AV-Odyssey / AV-Odyssey
This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"
☆30 Updated 10 months ago
TeleHuman / Align-Then-Steer
Official Implementation of "Align-Then-stEer: Adapting the Vision-Language Action Models through Unified Latent Guidance".
☆33 Updated last month
AGI-Edgerunners / IIL
Code for our Paper "All in an Aggregated Image for In-Image Learning"
☆29 Updated last year
lmzpai / roboMamba
The repo of paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation`
☆139 Updated 10 months ago
Xiuyuan-Chen / AutoEval-Video
☆36 Updated last year
tulerfeng / Awesome-Embodied-Multimodal-LLMs
Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).
☆121 Updated last year
pointarena / pointarena
☆29 Updated 2 months ago
yliu-cs / SSR
[NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
☆27 Updated last month
DigiRL-agent / digiq
☆116 Updated 7 months ago
EmbodiedBench / EmbodiedBench
[ICML 2025 Oral] Official repo of EmbodiedBench, a comprehensive benchmark designed to evaluate MLLMs as embodied agents.
☆214 Updated 3 weeks ago