PyTorch VQA: Visual Question Answering (https://arxiv.org/pdf/1505.00468.pdf)
☆99Aug 27, 2023Updated 2 years ago
Alternatives and similar repositories for basic_vqa
Users that are interested in basic_vqa are comparing it to the libraries listed below
- Visual Question Answering in PyTorch with various Attention Models☆20Mar 24, 2020Updated 5 years ago
- PyTorch VQA implementation that achieved top performances in the (ECCV18) VizWiz Grand Challenge: Answering Visual Questions from Blind P…☆63Oct 17, 2018Updated 7 years ago
- CNN+LSTM, Attention based, and MUTAN-based models for Visual Question Answering☆77Jan 19, 2020Updated 6 years ago
- An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.☆765Mar 10, 2024Updated last year
- VQA - Visual Question Answering☆14Nov 13, 2016Updated 9 years ago
- [ECCV 2020] Temporal Aggregate Representations for Long-Range Video Understanding☆11Sep 13, 2021Updated 4 years ago
- Code for ACL 2020 paper "Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA." Hyounghun Kim, Zineng T…☆34May 14, 2020Updated 5 years ago
- PyTorch Implementation of VQA Baseline & Hierarchical Co-Attention model☆16Oct 3, 2023Updated 2 years ago
- CDER (Conversational Diarization Error Rate) Scoring Tool☆22Sep 13, 2022Updated 3 years ago
- Train a deeper LSTM and normalized CNN Visual Question Answering model. This current code can get 58.16 on OpenEnded and 63.09 on Multipl…☆388Mar 22, 2019Updated 6 years ago
- Official Implementation of "Probing Language Models for Pre-training Data Detection"☆20Dec 4, 2024Updated last year
- An introduction to global assessment techniques using Python☆12Apr 24, 2023Updated 2 years ago
- This repository gives a GUI using PyQt4 for VQA demo using Keras Deep Learning Library. The VQA model is created using Pre-trained VGG-1…☆46Jul 11, 2021Updated 4 years ago
- Bilinear attention networks for visual question answering☆548Oct 30, 2023Updated 2 years ago
- [ICLR 2026] The official repository for the paper "AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning".☆72Feb 27, 2026Updated last week
- A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Common…☆673Jul 6, 2023Updated 2 years ago
- BLOCK (AAAI 2019), with a multimodal fusion library for deep learning models☆356Dec 4, 2019Updated 6 years ago
- Using Vision Transformers for enhanced wildfire detection in satellite images☆30May 14, 2022Updated 3 years ago
- A PyTorch reimplementation of "Bilinear Attention Network", "Intra- and Inter-modality Attention", "Learning Conditioned Graph Structures…☆297Jan 6, 2026Updated 2 months ago
- An implementation that applies pre-trained V+L models to downstream VQA tasks. Now supports: VisualBERT, LXMERT, and UNITER☆166Dec 11, 2022Updated 3 years ago
- Visual Question Answering Demo on pretrained model☆248Oct 31, 2025Updated 4 months ago
- Neural Module Network for VQA in Pytorch☆107Dec 16, 2017Updated 8 years ago
- Code repository corresponding to the paper "Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation" (NAACL 2024…☆10May 31, 2024Updated last year
- Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome☆1,467Feb 3, 2023Updated 3 years ago
- This is code for the EMNLP 2022 Paper "UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation".☆10Apr 30, 2023Updated 2 years ago
- PyTorch Implementation of Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning☆88May 25, 2020Updated 5 years ago
- Quickdraw 10 classes classification with machine learning and serve as a website playground☆10Aug 12, 2017Updated 8 years ago
- Local implementation of Deforum Stable Diffusion V0.5. Supports math automation, perspective flips, prompt weights, video masking☆14Nov 6, 2022Updated 3 years ago
- This project is a PyTorch implementation of the paper "ECViT: Efficient Convolutional Vision Transformer with Local-Attention and Multi-s…☆19Jun 12, 2025Updated 8 months ago
- memo☆13Dec 22, 2022Updated 3 years ago
- A collection of settings for getting started with Disco Diffusion Portrait Model☆10Aug 31, 2022Updated 3 years ago
- This project demonstrates the use of Deep Learning to detect emotion (sad, angry, happy etc) from the images of faces.☆11Feb 14, 2020Updated 6 years ago
- Code for running forward and backward versions of GPT2☆10Nov 20, 2021Updated 4 years ago