MayankSingal / VQA-Transformer
Visual Question Answering through transformers.
☆13 Updated 6 years ago
Alternatives and similar repositories for VQA-Transformer:
Users who are interested in VQA-Transformer are comparing it to the repositories listed below
- Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020 ☆81 Updated 4 years ago
- Chinese Visual Question Answering 中文看图问答 ☆47 Updated 7 years ago
- Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019) ☆65 Updated 4 years ago
- [EMNLP 2018] Training for Diversity in Image Paragraph Captioning ☆89 Updated 5 years ago
- Code for "simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions" (EMNLP 2018) ☆36 Updated 6 years ago
- Official code and dataset link for "VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles" ☆36 Updated 3 years ago
- Image Captioning based on Bottom-Up and Top-Down Attention model ☆102 Updated 6 years ago
- An implementation that applies pre-trained V+L models to downstream VQA tasks. Now supports VisualBERT, LXMERT, and UNITER ☆163 Updated 2 years ago
- A GCN based visual question generation model ☆13 Updated 5 years ago
- Code for GHA (ACCV2018) ☆13 Updated 6 years ago
- Code for paper "Adaptively Aligned Image Captioning via Adaptive Attention Time". NeurIPS 2019 ☆49 Updated 5 years ago
- VIST storytelling evaluation tool ☆21 Updated last year
- Official code and data for EMNLP 2020 paper "Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attenti… ☆21 Updated 4 years ago
- ☆91 Updated 2 years ago
- ☆62 Updated 3 years ago
- This repository contains the PyTorch implementation for our SCAI (EMNLP-2018) submission "A Knowledge-Grounded Multimodal Search-Based Co… ☆29 Updated 4 years ago
- The source code of ACL 2020 paper: "Cross-Modality Relevance for Reasoning on Language and Vision" ☆26 Updated 3 years ago
- Code for AAAI 2020 paper "What Makes A Good Story? Designing Composite Rewards for Visual Storytelling" ☆26 Updated 3 years ago
- Code of Dense Relational Captioning ☆68 Updated last year
- ☆44 Updated 2 years ago
- code for fluency-guided cross-lingual image captioning ☆30 Updated 6 years ago
- Good News Everyone! - CVPR 2019 ☆128 Updated 2 years ago
- PyTorch implementation of paper: "Self-critical Sequence Training for Image Captioning" ☆24 Updated last year
- Code associated with the "Natural Language Rationales with Full-Stack Visual Reasoning" EMNLP Findings 2020 paper ☆24 Updated 4 years ago
- PyTorch implementation of Chinese image captioning on AI_challenger dataset ☆34 Updated 5 years ago
- A self-evident application of the VQA task is to design systems that aid blind people with sight-reliant queries. The VizWiz VQA dataset … ☆15 Updated last year
- BERT + Image Captioning ☆132 Updated 4 years ago
- An image-oriented evaluation tool for image captioning systems (EMNLP-IJCNLP 2019) ☆36 Updated 4 years ago
- PyTorch Implementation of Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning ☆84 Updated 4 years ago
- Re-implementation for 'R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering'. ☆12 Updated 5 years ago