RachanaJayaram / Cross-Attention-VizWiz-VQA
A self-evident application of the VQA task is to design systems that aid blind people with sight-reliant queries. The VizWiz VQA dataset originates from images and questions compiled by members of the visually impaired community and, as such, highlights some of the challenges presented by this particular use case.
☆14 Updated 11 months ago
Related projects
Alternatives and complementary repositories for Cross-Attention-VizWiz-VQA
- Code of Dense Relational Captioning ☆67 Updated last year
- The source code of ACL 2020 paper: "Cross-Modality Relevance for Reasoning on Language and Vision" ☆26 Updated 3 years ago
- Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019) ☆64 Updated 4 years ago
- Microsoft COCO Caption Evaluation Tool - Python 3 ☆33 Updated 5 years ago
- ROCK model for Knowledge-Based VQA in Videos ☆30 Updated 4 years ago
- Implementation of paper "Improving Image Captioning with Better Use of Caption" ☆32 Updated 4 years ago
- code for paper `MemCap: Memorizing Style Knowledge for Image Captioning` ☆11 Updated 4 years ago
- ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration ☆56 Updated last year
- A reading list of papers about Visual Question Answering. ☆32 Updated 2 years ago
- Compact Trilinear Interaction for Visual Question Answering (ICCV 2019) ☆38 Updated 2 years ago
- Counterfactual Samples Synthesizing for Robust VQA ☆76 Updated 2 years ago
- Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020 ☆81 Updated 4 years ago
- Human-like Controllable Image Captioning with Verb-specific Semantic Roles. ☆36 Updated 2 years ago
- [CVPR 2020] Transform and Tell: Entity-Aware News Image Captioning ☆91 Updated 7 months ago
- A pytorch implementation of "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering" for image captioning. ☆47 Updated 3 years ago
- An image-oriented evaluation tool for image captioning systems (EMNLP-IJCNLP 2019) ☆34 Updated 4 years ago
- [EMNLP 2018] Training for Diversity in Image Paragraph Captioning ☆90 Updated 5 years ago
- Code for our ACL2021 paper: "Check It Again: Progressive Visual Question Answering via Visual Entailment" ☆31 Updated 3 years ago
- A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning ☆24 Updated 4 years ago
- PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021) ☆22 Updated 2 years ago
- Code for our IJCAI2020 paper: Overcoming Language Priors with Self-supervised Learning for Visual Question Answering ☆48 Updated 4 years ago
- Adversarial Inference for Multi-Sentence Video Descriptions (CVPR 2019) ☆34 Updated 5 years ago
- Subjective Image Captioning using Capsule Generative Adversarial Network ☆12 Updated 3 years ago
- MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering ☆88 Updated last year
- Implementation for MAF: Multimodal Alignment Framework ☆43 Updated 4 years ago
- Unpaired Image Captioning ☆35 Updated 3 years ago
- Code for ACL 2020 paper "Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA." Hyounghun Kim, Zineng T… ☆34 Updated 4 years ago