microsoft / VITRA
VITRA: Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
☆165Updated this week
Alternatives and similar repositories for VITRA
Users that are interested in VITRA are comparing it to the libraries listed below
- Official implementation of Spatial-Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model☆144Updated last week
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts.☆200Updated last month
- [ECCV 2024] ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation☆253Updated 8 months ago
- Ctrl-World: A Controllable Generative World Model for Robot Manipulation☆198Updated last week
- EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video☆82Updated 3 months ago
- ☆178Updated 4 months ago
- Sim-to-real and CDM inference code for ManipAsInSim project.☆123Updated this week
- Official Repository of "EgoMono4D: Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos"☆39Updated 2 months ago
- Geometry-aware 4D Video Generation for Robot Manipulation☆66Updated 3 months ago
- Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets☆160Updated 2 months ago
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025)☆303Updated 4 months ago
- [NeurIPS 2025] EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation☆124Updated 4 months ago
- [NeurIPS 24] The implementation and dataset of LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and…☆56Updated 8 months ago
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation☆169Updated 5 months ago
- ☆135Updated 5 months ago
- [RSS 2025] Novel Demonstration Generation with Gaussian Splatting Enables Robust One-Shot Manipulation☆155Updated 6 months ago
- [NeurIPS 2024 D&B] Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning☆89Updated last year
- ☆137Updated 7 months ago
- Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos☆186Updated 3 months ago
- ☆64Updated 4 months ago
- ☆111Updated 9 months ago
- Implementation of Prompting with the Future: Open-World Model Predictive Control with Interactive Digital Twins. [RSS 2025]☆45Updated last month
- CVPR 2025☆37Updated this week
- [ICCV 2025] AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation☆93Updated 5 months ago
- [CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation☆170Updated 5 months ago
- List of papers on video-centric robot learning☆22Updated last year
- Code implementation of CVPR 2024 highlight paper "PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI"☆182Updated 6 months ago
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"☆103Updated last month
- Official implementation of "Re3Sim: Generating High-Fidelity Simulation Data via 3D-Photorealistic Real-to-Sim for Robotic Manipulation"☆126Updated 2 months ago
- Code for "Robot See Robot Do" presented at CoRL 2024!☆157Updated last year