declare-lab / Emma-X

Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning

☆78

Alternatives and similar repositories for Emma-X

Users interested in Emma-X are comparing it to the repositories listed below.

liufanfanlff / RoboUniview
☆61 Updated 9 months ago
aiming-lab / GRAPE
GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization
☆152 Updated 8 months ago
hume-vla / hume
🦾 A Dual-System VLA with System2 Thinking
☆122 Updated 3 months ago
Dantong88 / LLARVA
☆60 Updated 11 months ago
pickxiguapi / Embodied-R1
Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation"
☆108 Updated 3 months ago
rainbow979 / robodreamer
☆87 Updated last year
OpenHelix-Team / VLA-RFT
VLA-RFT: Vision-Language-Action Models with Reinforcement Fine-Tuning
☆96 Updated 2 months ago
InternRobotics / InstructVLA
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
☆70 Updated 2 months ago
RoboDita / Dita
ICCV2025
☆143 Updated 3 weeks ago
InternRobotics / F1-VLA
F1: A Vision Language Action Model Bridging Understanding and Generation to Actions
☆137 Updated last month
Fanqi-Lin / OneTwoVLA
Official implementation of "OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning"
☆200 Updated 6 months ago
OpenDriveLab / CLOVER
[NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
☆130 Updated 3 months ago
Zhoues / RoboRefer
[NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics"
☆205 Updated last month
GR1-Manipulation / GR-1
Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"
☆44 Updated last year
Max-Fu / otter
[ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
☆111 Updated 7 months ago
lmzpai / roboMamba
The repo of paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation`
☆143 Updated 11 months ago
JayceWen / tinyvla
☆67 Updated 9 months ago
TencentARC / Moto
[ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
☆150 Updated 2 months ago
SiyuanHuang95 / ManipVQA
[IROS24 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
☆98 Updated last year
InternRobotics / InternVLA-A1
InternVLA-A1: Unifying Understanding, Generation, and Action for Robotic Manipulation
☆54 Updated 2 months ago
yueyang130 / DeeR-VLA
Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"
☆119 Updated 9 months ago
UMass-Embodied-AGI / MultiPLY
Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
☆134 Updated last year
google-deepmind / robovqa
☆33 Updated last year
EmbodiedAI-RoboTron / RoboTron-Mani
☆98 Updated last month
changhaonan / A3VLM
[CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model`
☆121 Updated last year
Hoyyyaard / 3DFlowAction
☆41 Updated 5 months ago
EmbodiedGPT / EgoCOT_Dataset
☆54 Updated last year
thunlp / EmbodiedEval
Evaluate Multimodal LLMs as Embodied Agents
☆54 Updated 9 months ago
OpenGVLab / VeBrain
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
☆87 Updated 6 months ago
Zhangwenyao1 / DreamVLA
[NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
☆245 Updated 2 months ago