microsoft / VITRA
VITRA: Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
☆52Updated this week
Alternatives and similar repositories for VITRA
Users who are interested in VITRA are comparing it to the repositories listed below.
- [NeurIPS 2025] EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation☆121Updated 3 months ago
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts.☆184Updated 2 weeks ago
- Official implementation of Spatial-Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model☆92Updated this week
- [ARXIV’25] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control☆81Updated 3 months ago
- Cosmos-Transfer2.5, built on top of Cosmos-Predict2.5, produces high-quality world simulations conditioned on multiple spatial control in…☆105Updated last week
- Official Implementation of paper "Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence"☆135Updated 3 months ago
- Official Repository of "EgoMono4D: Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos"☆36Updated last month
- A list of works on video generation towards world model☆170Updated 2 weeks ago
- CVPR 2025☆36Updated 6 months ago
- A comprehensive list of papers investigating physical cognition in video generation, including papers, codes, and related websites.☆197Updated 2 weeks ago
- Code implementation of CVPR 2024 highlight paper "PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI"☆175Updated 4 months ago
- Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the …☆244Updated this week
- https://coshand.cs.columbia.edu/☆16Updated last year
- CVPR2025 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation☆28Updated 2 months ago
- DELTA: Dense Efficient Long-range 3D Tracking for Any video (ICLR 2025)☆127Updated 6 months ago
- HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos☆102Updated 6 months ago
- Official implementation of the paper "PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios" (CVPR 2024).☆70Updated last year
- [NeurIPS 24] The implementation and dataset of LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and…☆56Updated 7 months ago
- [ICLR 2024 Spotlight] Unified Human-Scene Interaction via Prompted Chain-of-Contacts☆238Updated 3 months ago
- Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation☆159Updated 3 months ago
- Code for "Motion-2-to-3: Leveraging 2D Motion Data to Boost 3D Motion Generation", Arxiv 2024☆89Updated 2 weeks ago
- ☆63Updated 3 months ago
- ☆174Updated 3 months ago
- [AAAI 2025] DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors☆216Updated last year
- (CVPR 2025 Highlight) The Scene Language: Representing Scenes with Programs, Words, and Embeddings☆241Updated 3 months ago
- Official implementation of CVPR24 highlight paper "Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Sce…☆163Updated last year
- ☆50Updated 6 months ago
- OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling☆377Updated 2 weeks ago
- [ICLR 2025] SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects☆86Updated 6 months ago
- ☆79Updated 5 months ago