An obvious application of visual question answering (VQA) is building systems that help blind people with questions that depend on sight. The VizWiz VQA dataset is built from images and questions collected by members of the visually impaired community and, as such, highlights some of the challenges specific to this use case.
☆15 · Dec 12, 2023 · Updated 2 years ago