michelecafagna26 / faster-rcnn-bottom-up-py
Extract features and bounding boxes using the original Bottom-up Attention Faster-RCNN in a few lines of Python code
☆9 Updated 2 years ago
Related projects
Alternatives and complementary repositories for faster-rcnn-bottom-up-py
- GraphVQA: Language-Guided Graph Neural Networks for Scene Graph Question Answering ☆56 Updated 3 years ago
- Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for … ☆60 Updated 2 years ago
- Code for WACV 2023 paper "VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge" ☆21 Updated last year
- MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering ☆88 Updated last year
- Coarse-to-Fine Reasoning for Visual Question Answering (CVPRW'22) ☆43 Updated 2 years ago
- A simplified PyTorch version of DenseCap ☆39 Updated last year
- Natural language guided image captioning ☆77 Updated 9 months ago
- A PyTorch reimplementation of bottom-up-attention models ☆16 Updated 3 years ago
- Human-like Controllable Image Captioning with Verb-specific Semantic Roles. ☆36 Updated 2 years ago
- Video Graph Transformer for Video Question Answering (ECCV'22) ☆44 Updated last year
- Code accompanying paper "Fine-Grained Visual Entailment" [ECCV 2022]. ☆10 Updated 2 years ago
- An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral) ☆84 Updated 2 years ago
- PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021) ☆22 Updated 2 years ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ☆129 Updated 3 months ago
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022) ☆184 Updated last year
- Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022] ☆67 Updated 5 months ago
- Official Implementation for CVPR 2023 paper "Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasonin… ☆10 Updated 4 months ago
- [Paper][IJCKG 2022] LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection ☆25 Updated 9 months ago
- Official repository for the A-OKVQA dataset ☆63 Updated 6 months ago
- A reading list of papers about Visual Question Answering. ☆32 Updated 2 years ago
- [CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding ☆144 Updated 4 months ago
- [CVPR 2021] Counterfactual VQA: A Cause-Effect Look at Language Bias ☆116 Updated 2 years ago
- Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021) ☆119 Updated last year
- A collection of papers about VQA-CP datasets and their results ☆38 Updated 2 years ago
- [EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning ☆87 Updated 3 months ago
- Source code for EMNLP 2022 paper “PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models” ☆47 Updated 2 years ago