huiwon-jang / RSP
Visual Representation Learning with Stochastic Frame Prediction (ICML 2024)
☆22Updated 9 months ago
Alternatives and similar repositories for RSP
Users interested in RSP are comparing it to the repositories listed below.
- Code release for ICLR 2023 paper: SlotFormer on object-centric dynamics models☆111Updated last year
- Code for Stable Control Representations☆25Updated 4 months ago
- Official Code for Neural Systematic Binder☆33Updated 2 years ago
- VP2 Benchmark (A Control-Centric Benchmark for Video Prediction, ICLR 2023)☆27Updated 5 months ago
- ☆56Updated 2 years ago
- Masked World Models for Visual Control☆129Updated 2 years ago
- (CVPR 2025) A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning☆16Updated 5 months ago
- Official code for Slot-Transformer for Videos (STEVE)☆49Updated 2 years ago
- ☆80Updated 2 weeks ago
- ☆43Updated last year
- This repository is the official implementation of Improving Object-centric Learning With Query Optimization☆51Updated 2 years ago
- PyTorch implementation of the Hiveformer research paper☆49Updated 2 years ago
- NeurIPS 2022 Paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation"☆96Updated 3 months ago
- ☆55Updated 8 months ago
- ☆42Updated last year
- Code for the ICLR 2024 spotlight paper: "Learning to Act without Actions" (introducing Latent Action Policies)☆117Updated last year
- Personal Python toolbox☆16Updated last year
- ☆44Updated last year
- Official repository for "VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training"☆171Updated last year
- Object-Centric-Representation Library (OCRL): This repo is to explore OCR on various downstream tasks from supervised learning tasks to R…☆12Updated last year
- Official repository for "LIV: Language-Image Representations and Rewards for Robotic Control" (ICML 2023)☆118Updated last year
- An unofficial pytorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment☆18Updated 7 months ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks☆57Updated 11 months ago
- [NeurIPS 2024] GenRL: Multimodal-foundation world models enable grounding language and video prompts into embodied domains, by turning th…☆79Updated 4 months ago
- [ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction☆102Updated 4 months ago
- [WIP] Code for LangToMo☆16Updated 2 months ago
- [ICRA2023] Grounding Language with Visual Affordances over Unstructured Data☆45Updated last year
- ☆39Updated 3 years ago
- ☆12Updated last year
- ☆72Updated 10 months ago