noagarcia / ArtVQA
AQUA dataset and VIKING model for the task of Art Visual Question Answering
☆23 · Updated 3 years ago
Alternatives and similar repositories for ArtVQA:
Users interested in ArtVQA are comparing it to the repositories listed below.
- Command-line tool for downloading and extending the RedCaps dataset. ☆46 · Updated last year
- Code for the paper "Understanding and Evaluating Racial Biases in Image Captioning" ☆11 · Updated 3 years ago
- source code and pre-trained/fine-tuned checkpoint for NAACL 2021 paper LightningDOT ☆73 · Updated 2 years ago
- ☆34 · Updated last year
- This repository provides data for the VAW dataset as described in the CVPR 2021 paper titled "Learning to Predict Visual Attributes in th… ☆63 · Updated 2 years ago
- Official code for the paper "Contrast and Classify: Training Robust VQA Models" published at ICCV, 2021 ☆19 · Updated 3 years ago
- ☆73 · Updated 2 years ago
- [EMNLP 2021] Code and data for our paper "Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers… ☆20 · Updated 3 years ago
- A dataset of crowdsourced ratings for machine-generated image captions ☆35 · Updated 5 years ago
- Code for the Globetrotter project ☆23 · Updated 2 years ago
- ☆50 · Updated 2 years ago
- [BMVC22] Official Implementation of ViCHA: "Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment" ☆54 · Updated 2 years ago
- Data of ACL 2019 Paper "Expressing Visual Relationships via Language". ☆62 · Updated 4 years ago
- kdexd/coco-caption@de6f385 ☆26 · Updated 4 years ago
- CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification - 4th Workshop on Computer Vision for Fashion, Art, and Design ☆27 · Updated 2 years ago
- Code, data, models for the Sherlock corpus ☆55 · Updated 2 years ago
- ☆32 · Updated 2 years ago
- A large-scale dataset for instance-level recognition for artworks is introduced. ☆47 · Updated last year
- [CVPR 2020] Transform and Tell: Entity-Aware News Image Captioning ☆90 · Updated 9 months ago
- Multi-sense word embeddings from visual co-occurrences ☆25 · Updated 5 years ago
- ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data. ☆84 · Updated last year
- VQA baseline with Conditional Batch Normalization ☆15 · Updated 6 years ago
- Use CLIP to represent video for Retrieval Task ☆69 · Updated 3 years ago
- Official MXNet implementation of "Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning" (CVPR 2020) ☆78 · Updated 2 years ago
- This is the repo for Multi-level textual grounding ☆33 · Updated 4 years ago
- ☆31 · Updated 6 years ago
- ☆46 · Updated 3 years ago
- Reliably download millions of images efficiently ☆113 · Updated 3 years ago
- RareAct: A video dataset of unusual interactions ☆32 · Updated 4 years ago
- ☆26 · Updated 3 years ago