google-research-datasets / 2.5vrd
This dataset contains about 110k images annotated with the depth and occlusion relationships between arbitrary objects. It enables research on the 2.5D Visual Relationship Detection (2.5VRD) task introduced in https://arxiv.org/abs/2104.12727.
☆15 Updated 3 years ago
Related projects:
- A simple but well-performing "single-hop" visual attention model for the GQA dataset ☆20 Updated 5 years ago
- Code for reproducing experiments in "How Useful is Self-Supervised Pretraining for Visual Tasks?" ☆60 Updated last month
- The project is about predicting sets (of classes) from images. ☆22 Updated 3 years ago
- [ICLR2024] (EvALign-ICL Benchmark) Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context … ☆20 Updated 6 months ago
- ☆11 Updated 4 years ago
- ☆31 Updated this week
- ☆16 Updated 2 years ago
- Robust Contrastive Learning Using Negative Samples with Diminished Semantics (NeurIPS 2021) ☆39 Updated 2 years ago
- Code for SelfAugment ☆27 Updated 3 years ago
- Rethinking Nearest Neighbors for Visual Classification ☆31 Updated 2 years ago
- We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances… ☆44 Updated 3 years ago
- MLPs for Vision and Language Modeling (Coming Soon) ☆27 Updated 2 years ago
- ☆25 Updated 3 years ago
- Data of ACL 2019 Paper "Expressing Visual Relationships via Language". ☆62 Updated 3 years ago
- dataset cleansing for Visual Genome ☆30 Updated 7 years ago
- ☆74 Updated 2 years ago
- ☆25 Updated 4 years ago
- ☆32 Updated 2 years ago
- Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering" ☆18 Updated last year
- Code and data for the project "Visually grounded continual learning of compositional semantics" ☆21 Updated last year
- ☆42 Updated 3 years ago
- VQA baseline with Conditional Batch Normalization ☆15 Updated 6 years ago
- RareAct: A video dataset of unusual interactions ☆32 Updated 4 years ago
- A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering ☆40 Updated 3 years ago
- ☆34 Updated 5 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone" ☆34 Updated 2 years ago
- Command-line tool for downloading and extending the RedCaps dataset. ☆45 Updated 9 months ago
- Code for the Globetrotter project ☆23 Updated 2 years ago
- (ICML 2021) Implementation for S2SD - Simultaneous Similarity-based Self-Distillation for Deep Metric Learning. Paper Link: https://arxiv… ☆41 Updated 4 years ago
- ☆11 Updated 7 years ago