thuml / iVideoGPT
Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223
☆70 · Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for iVideoGPT
- Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://ar… ☆55 · Updated last month
- ☆77 · Updated 3 months ago
- Repository for "General Flow as Foundation Affordance for Scalable Robot Learning" ☆37 · Updated 7 months ago
- ☆30 · Updated 3 weeks ago
- [ECCV 2024] 💐Official implementation of the paper "Diffusion Reward: Learning Rewards via Conditional Video Diffusion" ☆78 · Updated 4 months ago
- ☆46 · Updated 2 months ago
- Code for paper "Grounding Video Models to Actions through Goal Conditioned Exploration". ☆24 · Updated last week
- Official repository of Learning to Act from Actionless Videos through Dense Correspondences. ☆173 · Updated 6 months ago
- VP2 Benchmark (A Control-Centric Benchmark for Video Prediction, ICLR 2023) ☆23 · Updated 10 months ago
- ☆68 · Updated 2 months ago
- The repo of paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation` ☆61 · Updated 5 months ago
- [CVPR'2024] "SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution" ☆52 · Updated last month
- Codebase for HiP ☆87 · Updated 11 months ago
- ☆80 · Updated last week
- ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation ☆83 · Updated 4 months ago
- [ICML 2024] The official implementation of "DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning" ☆68 · Updated last month
- ☆39 · Updated 2 weeks ago
- Official implementation of "Self-Improving Video Generation" ☆50 · Updated last week
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆122 · Updated 3 weeks ago
- Pytorch implementation of "Genie: Generative Interactive Environments", Bruce et al. (2024). ☆71 · Updated 3 months ago
- Official repository for "LIV: Language-Image Representations and Rewards for Robotic Control" (ICML 2023) ☆86 · Updated last year
- ☆58 · Updated last month
- [RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre… ☆69 · Updated last month
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆78 · Updated 10 months ago
- Code for subgoal synthesis via image editing ☆113 · Updated last year
- [RSS 2024] Learning Manipulation by Predicting Interaction ☆90 · Updated 3 months ago
- Dreamitate: Real-World Visuomotor Policy Learning via Video Generation (CoRL 2024) ☆41 · Updated 4 months ago
- ☆110 · Updated last year
- Theia: Distilling Diverse Vision Foundation Models for Robot Learning ☆179 · Updated last month
- [ICRA2023] Grounding Language with Visual Affordances over Unstructured Data ☆36 · Updated last year