RachanaJayaram / Cross-Attention-VizWiz-VQA
A self-evident application of the VQA task is to design systems that aid blind people with sight-reliant queries. The VizWiz VQA dataset originates from images and questions compiled by members of the visually impaired community and, as such, highlights some of the challenges presented by this particular use case.
☆15 Updated last year
Alternatives and similar repositories for Cross-Attention-VizWiz-VQA:
Users who are interested in Cross-Attention-VizWiz-VQA are comparing it to the repositories listed below:
- Code of Dense Relational Captioning ☆69 Updated 2 years ago
- The source code of ACL 2020 paper: "Cross-Modality Relevance for Reasoning on Language and Vision" ☆27 Updated 3 years ago
- An image-oriented evaluation tool for image captioning systems (EMNLP-IJCNLP 2019) ☆38 Updated 4 years ago
- Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020 ☆80 Updated 4 years ago
- ROCK model for Knowledge-Based VQA in Videos ☆31 Updated 4 years ago
- Implementation for MAF: Multimodal Alignment Framework ☆46 Updated 4 years ago
- Microsoft COCO Caption Evaluation Tool - Python 3 ☆33 Updated 5 years ago
- Implementation of paper "Improving Image Captioning with Better Use of Caption" ☆32 Updated 4 years ago
- Code for paper "Adaptively Aligned Image Captioning via Adaptive Attention Time". NeurIPS 2019 ☆50 Updated 5 years ago
- Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019) ☆65 Updated 4 years ago
- Code for ACL 2020 paper "Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA." Hyounghun Kim, Zineng T… ☆34 Updated 4 years ago
- Code and Resources for the Transformer Encoder Reasoning Network (TERN) - https://arxiv.org/abs/2004.09144 ☆58 Updated last year
- [EMNLP 2018] Training for Diversity in Image Paragraph Captioning ☆89 Updated 5 years ago
- CNN+LSTM, Attention based, and MUTAN-based models for Visual Question Answering ☆75 Updated 5 years ago
- Code for "On diversity in image captioning: metrics and methods". ☆8 Updated 4 years ago
- Compact Trilinear Interaction for Visual Question Answering (ICCV 2019) ☆38 Updated 2 years ago
- An updated PyTorch implementation of hengyuan-hu's version for 'Bottom-Up and Top-Down Attention for Image Captioning and Visual Question… ☆36 Updated 3 years ago
- A pytorch implementation of "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering" for image captioning.