mugen-org / MUGEN_coinrun
A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts to train RL agents to navigate the closed world and collect video data.
☆13 Updated 2 years ago
Alternatives and similar repositories for MUGEN_coinrun:
Users who are interested in MUGEN_coinrun are comparing it to the libraries listed below:
- Multimodal video-audio-text generation and retrieval between every pair of modalities on the MUGEN dataset. The repo contains the traini… ☆39 Updated last year
- This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes … ☆84 Updated 2 years ago
- Command-line tool for downloading and extending the RedCaps dataset. ☆46 Updated last year
- ☆41 Updated last year
- We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances… ☆47 Updated 3 years ago
- Official PyTorch implementation of "Improving Generative Imagination in Object-Centric World Models" ☆35 Updated 2 years ago
- RareAct: A video dataset of unusual interactions ☆32 Updated 4 years ago
- Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch ☆64 Updated 2 years ago
- [NeurIPS 2021 Spotlight] Learning to Compose Visual Relations ☆102 Updated last year
- Release of ImageNet-Captions ☆45 Updated 2 years ago
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023) ☆33 Updated 2 years ago
- ☆38 Updated 2 years ago
- A new play-and-plug method of controlling an existing generative model with conditioning attributes and their compositions. ☆72 Updated 3 years ago
- ☆50 Updated 2 years ago
- Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound" ☆139 Updated 2 years ago
- DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023) ☆140 Updated last year
- ☆36 Updated last year
- ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021. ☆56 Updated 3 years ago
- ☆53 Updated 2 years ago
- ☆73 Updated 2 years ago
- The official implementation of "Train Sparsely, Generate Densely: Memory-efficient Unsupervised Training of High-resolution Temporal GAN" ☆81 Updated 2 years ago
- [ICLR 2022] RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning ☆63 Updated 2 years ago
- [CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations" ☆52 Updated last year
- Official code for our CVPR 2023 paper: Test of Time: Instilling Video-Language Models with a Sense of Time ☆45 Updated 9 months ago
- Un-*** 50 billions multimodality dataset ☆24 Updated 2 years ago
- https://arxiv.org/abs/2209.15162 ☆49 Updated 2 years ago
- [NeurIPS 2021] Code for Unsupervised Learning of Compositional Energy Concepts ☆59 Updated 2 years ago
- Code for Learning to Learn Language from Narrated Video ☆33 Updated last year
- Latent Normalizing Flows for Many-to-Many Cross Domain Mappings (ICLR 2020) ☆33 Updated 2 years ago
- VQVAE for video prediction ☆27 Updated 2 years ago