guanghuixu / CRN_tvqa
☆15 Updated 3 years ago
Related projects:
- ☆32 Updated 3 years ago
- Improving One-stage Visual Grounding by Recursive Sub-query Construction, ECCV 2020 ☆80 Updated 2 years ago
- Human-like Controllable Image Captioning with Verb-specific Semantic Roles. ☆36 Updated 2 years ago
- The official PyTorch code for "Relation-aware Instance Refinement for Weakly Supervised Visual Grounding" accepted by CVPR2021 ☆26 Updated 2 years ago
- Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020. ☆62 Updated 3 years ago
- A pytorch implementation of Attention Is All You Need (Transformer) for image captioning. ☆12 Updated 2 years ago
- The imdb files with SBD-Trans OCR for TextVQA dataset. ☆10 Updated 2 years ago
- An unofficial pytorch implementation of "TransVG: End-to-End Visual Grounding with Transformers". ☆51 Updated 3 years ago
- Weakly Supervised Grounding for VQA in Vision-Language Transformers ☆16 Updated last year
- Compact Trilinear Interaction for Visual Question Answering (ICCV 2019) ☆38 Updated last year
- Microsoft COCO Caption Evaluation Tool - Python 3 ☆30 Updated 5 years ago
- Learning phrase grounding from captioned images through InfoNCE bound on mutual information ☆72 Updated 4 years ago
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"