bdaiinstitute / theia
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
☆253Updated 6 months ago
Alternatives and similar repositories for theia
Users who are interested in theia are comparing it to the repositories listed below
- A Vision-Language Model for Spatial Affordance Prediction in Robotics☆195Updated 3 months ago
- [ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy☆225Updated 6 months ago
- Official Repository for MolmoAct☆212Updated last month
- [ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction☆106Updated 6 months ago
- Official Repository for SAM2Act☆205Updated last month
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025)☆276Updated 2 months ago
- Code for subgoal synthesis via image editing☆143Updated last year
- OpenVLA: An open-source vision-language-action model for robotic manipulation.☆269Updated 7 months ago
- [CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model`☆120Updated last year
- ☆283Updated 6 months ago
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos☆385Updated 8 months ago
- Official repository of Learning to Act from Actionless Videos through Dense Correspondences.☆232Updated last year
- ☆267Updated last year
- ☆58Updated 10 months ago
- Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"☆283Updated last year
- Official codebase for "Any-point Trajectory Modeling for Policy Learning"☆253Updated 4 months ago
- ☆221Updated last year
- [RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre…☆156Updated last year
- Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets☆134Updated last week
- Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos☆166Updated last month
- A Benchmark for Evaluating Generalization for Robotic Manipulation☆138Updated 7 months ago
- Distributed Robot Interaction Dataset.☆254Updated last month
- Embodied Reasoning Question Answer (ERQA) Benchmark☆229Updated 7 months ago
- A unified architecture for multimodal multi-task robotic policy learning.☆167Updated last year
- ICCV2025☆135Updated last month
- VLAC: A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning☆185Updated 3 weeks ago
- DROID Policy Learning and Evaluation☆235Updated 5 months ago
- Official code for "Behavior Generation with Latent Actions" (ICML 2024 Spotlight)☆184Updated last year
- [ICRA 2025] In-Context Imitation Learning via Next-Token Prediction☆95Updated 7 months ago
- ☆122Updated 2 years ago