zhijie-group / Mantis

The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight

☆72

Alternatives and similar repositories for Mantis

Users who are interested in Mantis are comparing it to the repositories listed below.

OpenHelix-Team / CEED-VLA
Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.
☆46 Updated 3 months ago
WM-PO / WMPO
Official Implementation of Paper: WMPO: World Model-based Policy Optimization for Vision-Language-Action Models
☆96 Updated this week
GigaAI-research / VLA-R1
☆60 Updated last month
OpenHelix-Team / VLA-RFT
VLA-RFT: Vision-Language-Action Models with Reinforcement Fine-Tuning
☆113 Updated 3 months ago
Biscue5 / EgoScaler
[CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision
☆32 Updated last month
OpenGVLab / VeBrain
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
☆87 Updated 7 months ago
ustcwhy / BitVLA
Official implementation for BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation
☆100 Updated 5 months ago
yliu-cs / SSR
[NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
☆38 Updated 2 months ago
UMass-Embodied-AGI / MindJourney
[NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"
☆121 Updated 2 months ago
thuml / RLVR-World
Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934
☆177 Updated 2 months ago
InternRobotics / InstructVLA
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
☆90 Updated 3 months ago
hume-vla / hume
🦾 A Dual-System VLA with System2 Thinking
☆129 Updated 4 months ago
shengliangd / StereoVLA
StereoVLA is powered by stereo vision and supports flexible deployment with high tolerance to camera pose variations.
☆25 Updated last week
metadriverse / urban-sim
[CVPR 2025 Highlight] Towards Autonomous Micromobility through Scalable Urban Simulation
☆153 Updated 2 months ago
World-In-World / world-in-world
Code implementation of the paper "World-in-World: World Models in a Closed-Loop World"
☆121 Updated 2 weeks ago
SHAILAB-IPEC / EO1
EO: Open-source Unified Embodied Foundation Model Series
☆34 Updated 2 weeks ago
xiaoxiao0406 / VQ-VLA
The official repo for the paper "VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers" (ICCV 2025)
☆107 Updated last month
InternRobotics / OST-Bench
[NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
☆69 Updated 3 months ago
OpenHelix-Team / LLaVA-VLA
LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [Actively Maintained🔥]
☆173 Updated 2 months ago
Zhoues / RoboTracer
Official implementation of "RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics"
☆43 Updated this week
pickxiguapi / Embodied-R1
Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation"
☆115 Updated 4 months ago
buoyancy99 / large-video-planner
☆113 Updated this week
InternRobotics / F1-VLA
F1: A Vision Language Action Model Bridging Understanding and Generation to Actions
☆153 Updated last week
zhijie-group / R1-Zero-VSI
☆42 Updated 7 months ago
Universal-Control / ppt_learning
A unified robotic manipulation learning framework
☆21 Updated 4 months ago
Zhangwenyao1 / DreamVLA
[NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
☆265 Updated 3 months ago
ByteDance-Seed / Chain-of-Action
Official implementation of Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation. Accepted at NeurIPS 2025.
☆92 Updated 3 weeks ago
liufanfanlff / RoboUniview
☆63 Updated 10 months ago
Zhoues / RoboRefer
[NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics"
☆218 Updated 3 weeks ago
DreamLM / Dream-VLX
Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA.
☆83 Updated this week