manantomar / video-occupancy-models
☆10 Updated 3 months ago
Related projects
Alternatives and complementary repositories for video-occupancy-models
- [ICML 2022] Official PyTorch implementation of the paper "Unsupervised Image Representation Learning with Deep Latent Particles" ☆26 Updated last year
- Slot-TTA shows that test-time adaptation using slot-centric models can improve image segmentation on out-of-distribution examples. ☆24 Updated last year
- This repository is a collection of research papers on World Models. ☆34 Updated last year
- ☆32 Updated 2 years ago
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆29 Updated 4 months ago
- ElasticTok: Adaptive Tokenization for Image and Video ☆31 Updated this week
- Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth? ☆34 Updated this week
- A paper list on world models ☆25 Updated 5 months ago
- ☆13 Updated 4 months ago
- ☆26 Updated this week
- Evaluating pre-trained navigation agents under corruptions ☆28 Updated 3 years ago
- ☆20 Updated 3 weeks ago
- Agent-to-Sim: Learning Interactive Behavior from Casual Videos. ☆30 Updated 3 weeks ago
- [CVPR 2024 Highlight] SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers ☆52 Updated 4 months ago
- Semantic-Aware Fine-Grained Correspondence, at ECCV 2022 (Oral) ☆15 Updated 2 years ago
- Source code release for "Leveraging Demonstrations with Latent Space Priors" ☆39 Updated last year
- ☆10 Updated last year
- [ICCV 2023] Learning Fine-Grained Features for Pixel-wise Video Correspondences ☆17 Updated 8 months ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks ☆59 Updated last month
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆78 Updated 9 months ago
- ☆21 Updated 3 months ago
- Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space" ☆18 Updated last year
- Transformer implementation for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion" ☆55 Updated 3 weeks ago
- VQVAE for video prediction ☆26 Updated 2 years ago
- Clarity: A Minimalist Website Template for AI Research ☆53 Updated last week
- [ICLR 2022] RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning ☆64 Updated 2 years ago
- Code release for DriveGAN (CVPR 2021) ☆93 Updated 2 years ago
- ☆25 Updated 4 months ago
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆39 Updated 4 months ago