flow-diffusion / AVDC
Official repository of Learning to Act from Actionless Videos through Dense Correspondences.
☆224Updated last year
Alternatives and similar repositories for AVDC
Users who are interested in AVDC are comparing it to the repositories listed below
- Code for subgoal synthesis via image editing☆142Updated last year
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025)☆253Updated 2 weeks ago
- ☆203Updated last year
- Official codebase for "Any-point Trajectory Modeling for Policy Learning"☆240Updated last month
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos☆341Updated 6 months ago
- ☆106Updated 3 weeks ago
- Official implementation of GR-MG☆85Updated 6 months ago
- [ICCV 2023] Official code repository for ARNOLD benchmark☆172Updated 4 months ago
- Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"☆270Updated last year
- [ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation☆211Updated 3 weeks ago
- Reimplementation of GR-1, a generalized policy for robotics manipulation.☆139Updated 11 months ago
- ☆53Updated 7 months ago
- [ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos☆120Updated 2 months ago
- Repository for "General Flow as Foundation Affordance for Scalable Robot Learning"☆60Updated 7 months ago
- [RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre…☆150Updated 9 months ago
- Single-file implementation to advance vision-language-action (VLA) models with reinforcement learning.☆190Updated 2 weeks ago
- Official code for "QueST: Self-Supervised Skill Abstractions for Continuous Control" [NeurIPS 2024]☆94Updated 8 months ago
- Embodied Reasoning Question Answer (ERQA) Benchmark☆191Updated 4 months ago
- Code for the paper "Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation"☆92Updated last year
- ICCV2025☆112Updated 2 weeks ago
- A unified architecture for multimodal multi-task robotic policy learning.☆160Updated last year
- Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets☆103Updated last week
- ☆44Updated last year
- Official repo of VLABench, a large scale benchmark designed for fairly evaluating VLA, Embodied Agent, and VLMs.☆269Updated last month
- Code for Reinforcement Learning from Vision Language Foundation Model Feedback☆116Updated last year
- ☆119Updated 2 years ago
- A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks☆132Updated last week
- ☆72Updated 9 months ago
- [CoRL 2023 Oral] GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields☆135Updated last year
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization☆135Updated 4 months ago