TencentARC / Moto

[ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos

☆137

Alternatives and similar repositories for Moto

Users who are interested in Moto are comparing it to the libraries listed below.

Dantong88 / LLARVA
☆59 Updated 10 months ago
RoboDita / Dita
ICCV2025
☆135 Updated 2 months ago
baaivision / UniVLA
Unified Vision-Language-Action Model
☆213 Updated last week
BeingBeyond / Being-H0
Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos
☆168 Updated last month
thuml / iVideoGPT
Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223
☆153 Updated last month
InternRobotics / InternVLA-M1
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
☆191 Updated last week
bytedance / IRASim
☆121 Updated 3 months ago
pickxiguapi / Embodied-R1
Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation"
☆89 Updated 2 months ago
HeegerGao / FLIP
Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks
☆74 Updated 10 months ago
bytedance / GR-MG
Official implementation of GR-MG
☆89 Updated 9 months ago
ShuangLI59 / unified_video_action
Official PyTorch Implementation of Unified Video Action Model (RSS 2025)
☆278 Updated 3 months ago
aiming-lab / GRAPE
GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization
☆144 Updated 6 months ago
Tengbo-Yu / AnyBimanual
[ICCV2025] AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation
☆90 Updated 4 months ago
Max-Fu / otter
[ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
☆106 Updated 6 months ago
flow-diffusion / AVDC
Official repository of Learning to Act from Actionless Videos through Dense Correspondences.
☆232 Updated last year
Zhangwenyao1 / DreamVLA
[NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
☆203 Updated last month
Little-Podi / AdaWorld
[ICML'25] The PyTorch implementation of paper: "AdaWorld: Learning Adaptable World Models with Latent Actions".
☆166 Updated 4 months ago
rainbow979 / robodreamer
☆81 Updated last year
Fanqi-Lin / OneTwoVLA
Official implementation of "OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning"
☆191 Updated 4 months ago
Robert-gyj / Ctrl-World
Ctrl-World: A Controllable Generative World Model for Robot Manipulation
☆67 Updated this week
EDiRobotics / GR1-Training
Reimplementation of GR-1, a generalized policy for robotics manipulation.
☆143 Updated last year
LatentActionPretraining / LAPA
[ICLR 2025] LAPA: Latent Action Pretraining from Videos
☆387 Updated 9 months ago
Hoyyyaard / 3DFlowAction
☆37 Updated 3 months ago
InternRobotics / Seer
[ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
☆247 Updated 3 months ago
hume-vla / hume
🦾 A Dual-System VLA with System2 Thinking
☆114 Updated 2 months ago
qizekun / SoFar
[NeurIPS 2025 Spotlight] SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
☆197 Updated 3 months ago
OpenDriveLab / CLOVER
[NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
☆129 Updated last month
Koorye / Inspire
Official implementation of the paper "InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning"
☆43 Updated 3 weeks ago
OpenMOSS / VLABench
Official repo of VLABench, a large-scale benchmark designed for fairly evaluating VLAs, embodied agents, and VLMs.
☆310 Updated 2 months ago
xiaoxiao0406 / VQ-VLA
The official repo for the paper "VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers" (ICCV 2025)
☆86 Updated 2 months ago