microsoft/VITRA

[ICRA 2026] VITRA: Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos

☆448

Alternatives and similar repositories for VITRA

Users who are interested in VITRA are comparing it to the repositories listed below.

apple / ml-egodex
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
☆345 · Aug 20, 2025 · Updated 11 months ago
RchalYang / EgoVLA_Release
☆172 · Dec 4, 2025 · Updated 7 months ago
ThunderVVV / HaWoR
HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos
☆305 · Apr 16, 2026 · Updated 3 months ago
quincy-u / Ego_Humanoid_Manipulation_Benchmark
Egocentric humanoid manipulation benchmark
☆89 · Dec 4, 2025 · Updated 7 months ago
GaTech-RL2 / EgoVerse
EgoVerse: Egocentric Data for Robot Learning from Around the World
☆480 · Updated this week
BeingBeyond / Being-H
Being-H is BeingBeyond's family of human-centric embodied foundation models.
☆1,104 · Jun 16, 2026 · Updated last month
NVlabs / cosmos-policy
Cosmos Policy
☆836 · Jan 23, 2026 · Updated 6 months ago
RogerQi / human-policy
☆258 · May 12, 2025 · Updated last year
ManipTrans / ManipTrans
[CVPR 2025] 🎉 Official repository of "ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning"
☆329 · Oct 10, 2025 · Updated 9 months ago
LatentActionPretraining / LAPA
[ICLR 2025] LAPA: Latent Action Pretraining from Videos
☆561 · Jan 22, 2025 · Updated last year
jiangranlv / LDA-1B
[RSS 2026] LDA-1B: Scaling Latent Dynamics Action Model via Universal Embodied Data Ingestion
☆285 · May 26, 2026 · Updated last month
thu-ml / Motus
Official code of Motus: A Unified Latent Action World Model
☆1,214 · Jan 5, 2026 · Updated 6 months ago
starVLA / starVLA
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
☆3,267 · Updated this week
tars-robotics / World-In-Your-Hands
☆127 · May 10, 2026 · Updated 2 months ago
dreamzero0 / dreamzero
Code to pretrain, fine-tune, and evaluate DreamZero and run sim & real-world evals
☆2,478 · Apr 19, 2026 · Updated 3 months ago
facebookresearch / dexwm
Official code and data from DexWM ("World Models Can Leverage Human Videos for Dexterous Manipulation").
☆94 · Jun 23, 2026 · Updated last month
Robbyant / lingbot-va
[RSS 2026] Causal video-action world model for generalist robot control
☆1,661 · Jul 9, 2026 · Updated 2 weeks ago
thu-ml / RDT2
Official code of RDT 2
☆795 · Feb 7, 2026 · Updated 5 months ago
HongzheBi / H_RDT
H-RDT: Human Manipulation Enhanced Bimanual Robotic Manipulation
☆154 · Dec 21, 2025 · Updated 7 months ago
X-Square-Robot / wall-x
Building General-Purpose Robots Based on Embodied Foundation Model
☆1,189 · Updated this week
unidex-ai / UniDex
[CVPR 2026] UniDex: A Robot Foundation Suite for Universal Dexterous Hand Control from Egocentric Human Videos
☆165 · Mar 31, 2026 · Updated 3 months ago
BeingBeyond / Being-H0
Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos (ICML 2026)
☆51 · May 4, 2026 · Updated 2 months ago
RoboTwin-Platform / RoboTwin
RoboTwin 2.0 Official Repo
☆2,625 · Jul 14, 2026 · Updated last week
XiongyiCai / Human0
☆34 · Jan 17, 2026 · Updated 6 months ago
ShuangLI59 / unified_video_action
Official PyTorch Implementation of Unified Video Action Model (RSS 2025)
☆400 · Jul 23, 2025 · Updated last year
PRIME-RL / SimpleVLA-RL
[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
☆1,792 · Jan 6, 2026 · Updated 6 months ago
robocasa / robocasa
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
☆1,567 · Jul 8, 2026 · Updated 2 weeks ago
Robert-gyj / Ctrl-World
ICLR 2026 Paper: Ctrl-World
☆539 · Apr 8, 2026 · Updated 3 months ago
PKU-EPIC / GraspVLA
[CoRL25] GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data
☆391 · Dec 29, 2025 · Updated 6 months ago
thu-ml / RoboticsDiffusionTransformer
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
☆1,760 · Jan 21, 2026 · Updated 6 months ago
yuantianyuan01 / FastWAM
Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?
☆1,205 · Apr 3, 2026 · Updated 3 months ago
roboterax / video-prediction-policy
Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations https://video-prediction-policy.github.io
☆408 · May 17, 2025 · Updated last year
DAGroup-PKU / HumanNet
HumanNet: Scaling Human-centric Video Learning to One Million Hours
☆277 · May 26, 2026 · Updated last month
MarionLepert / phantom
☆106 · Aug 29, 2025 · Updated 10 months ago
OpenDriveLab / UniVLA
[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions
☆1,112 · Nov 19, 2025 · Updated 8 months ago
2toinf / X-VLA
[ICLR 2026] The official implementation of "Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model"
☆694 · Jun 10, 2026 · Updated last month
NVIDIA / DreamDojo
Official Codebase for "DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos" (ICML 2026)
☆1,007 · Mar 21, 2026 · Updated 4 months ago
NVIDIA / GR00T-Dreams
DreamGen: Nvidia GEAR Lab's initiative to solve the robotics data problem using world models
☆591 · Oct 24, 2025 · Updated 9 months ago
robocasa / robocasa-gr1-tabletop-tasks
Simulation benchmarks of GR1 Tabletop Tasks for GR00T N1
☆145 · Aug 9, 2025 · Updated 11 months ago