SJTU-DENG-Lab/Mantis

[CVPR 2026] Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight

☆92

Alternatives and similar repositories for Mantis

Users who are interested in Mantis are comparing it to the repositories listed below.

SJTU-DENG-Lab / UniCMs
☆39 · May 20, 2025 · Updated last year
SJTU-DENG-Lab / WLA
The official implementation of World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis
☆122 · Jun 18, 2026 · Updated last month
SJTU-DENG-Lab / LoPA
LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding
☆39 · Apr 25, 2026 · Updated 2 months ago
SJTU-DENG-Lab / LightningRL
LightningRL: Breaking the Accuracy–Parallelism Trade-off of Block-wise dLLMs via Reinforcement Learning
☆30 · Apr 25, 2026 · Updated 2 months ago
SJTU-DENG-Lab / LatentUM
☆56 · Apr 9, 2026 · Updated 3 months ago
SJTU-DENG-Lab / SIFT
SIFT: Grounding LLM Reasoning in Contexts via Stickers
☆57 · Mar 6, 2025 · Updated last year
Zhangwenyao1 / DreamVLA
[NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
☆362 · Jan 6, 2026 · Updated 6 months ago
SJTU-DENG-Lab / AdaMoE
[Findings of EMNLP 2024] AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models
☆20 · Oct 2, 2024 · Updated last year
SJTU-DENG-Lab / Orthogonal-Neural-operator
Code for orthogonal neural operator
☆17 · Oct 15, 2023 · Updated 2 years ago
SJTU-DENG-Lab / Orthus
☆89 · May 15, 2025 · Updated last year
SJTU-DENG-Lab / Diffulex
Flexible and Pluggable Serving Engine for Diffusion LLMs
☆147 · Jul 13, 2026 · Updated last week
InternRobotics / Seer
[ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
☆311 · Jul 8, 2025 · Updated last year
hao-ai-lab / JacobiForcing
[ICML 2026] Jacobi Forcing: Fast and Accurate Diffusion-style Decoding
☆122 · Feb 20, 2026 · Updated 5 months ago
SJTU-DENG-Lab / Discrete-Diffusion-Forcing
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
☆261 · Feb 3, 2026 · Updated 5 months ago
clamrobot / clam
☆16 · Updated this week
OpenHelix-Team / HiF-VLA
[CVPR 2026] HiF-VLA: An efficient, bidirectional spatiotemporal expansion Vision-Language-Action Model
☆75 · Mar 11, 2026 · Updated 4 months ago
InternRobotics / F1-VLA
F1: A Vision Language Action Model Bridging Understanding and Generation to Actions
☆200 · Jan 2, 2026 · Updated 6 months ago
PierreMarza / autonerf
Code for IROS 2024 paper "AutoNeRF: Training Implicit Scene Representations with Autonomous Agents"
☆17 · Oct 24, 2024 · Updated last year
yueyang130 / DeeR-VLA
Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"
☆128 · Feb 14, 2025 · Updated last year
moojink / rlds_dataset_mod
Efficiently apply modification functions to RLDS/TFDS datasets.
☆32 · Jun 19, 2024 · Updated 2 years ago
InternRobotics / InstructVLA
[ICLR 2026] InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
☆116 · Jan 27, 2026 · Updated 5 months ago
ZGC-EmbodyAI / LangForce
[ICML 2026] This repo is the official implementation of "LangForce : Bayesian Decomposition of Vision Language Action Models via Latent …
☆72 · Jun 16, 2026 · Updated last month
BrunoFANG1 / openpi_subtask_generation
☆26 · Oct 11, 2025 · Updated 9 months ago
univtac / UniVTAC
☆118 · Jun 20, 2026 · Updated last month
SJTU-DENG-Lab / R1-Zero-VSI
☆42 · Jun 9, 2025 · Updated last year
roboterax / video-prediction-policy
Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations https://video-prediction-policy.github.io
☆408 · May 17, 2025 · Updated last year
OpenHelix-Team / Spatial-Forcing
Official implementation of Spatial-Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model [ICLR2026]
☆266 · Jul 7, 2026 · Updated 2 weeks ago
HHYHRHY / MM-ACT
[CVPR'2026] "MM-ACT: Learn from Multimodal Parallel Generation to Act"
☆117 · Mar 13, 2026 · Updated 4 months ago
RchalYang / EgoVLA_Release
☆172 · Dec 4, 2025 · Updated 7 months ago
baaivision / UniVLA
[ICLR 2026] Unified Vision-Language-Action Model
☆314 · Oct 15, 2025 · Updated 9 months ago
LatentActionPretraining / LAPA
[ICLR 2025] LAPA: Latent Action Pretraining from Videos
☆561 · Jan 22, 2025 · Updated last year
zhihou7 / dit_policy_vla
☆16 · Mar 26, 2025 · Updated last year
Selen-Suyue / WoG
[ICML 2026] 🏂 World Guidance: World Modeling in Condition Space for Action Generation
☆159 · Apr 28, 2026 · Updated 2 months ago
mit-han-lab / vlash
Real-Time VLAs via Future-state-aware Asynchronous Inference.
☆438 · Apr 22, 2026 · Updated 3 months ago
yuantianyuan01 / FastWAM
Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?
☆1,201 · Apr 3, 2026 · Updated 3 months ago
InternRobotics / InternVLA-A-series
InternVLA-A1: Unifying Understanding, Generation, and Action for Robotic Manipulation
☆508 · Updated this week
OpenDriveLab / UniVLA
[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions
☆1,112 · Nov 19, 2025 · Updated 8 months ago
facebookresearch / egoman
The repository provides code for EgoMAN model and dataset creation scripts.
☆32 · Dec 31, 2025 · Updated 6 months ago
starVLA / starVLA
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
☆3,257 · Updated this week