RachanaJayaram / Cross-Attention-VizWiz-VQA
A self-evident application of the VQA task is to design systems that aid blind people with sight-reliant queries. The VizWiz VQA dataset originates from images and questions compiled by members of the visually impaired community and, as such, highlights some of the challenges presented by this particular use case.
☆15Dec 12, 2023Updated 2 years ago
Alternatives and similar repositories for Cross-Attention-VizWiz-VQA
Users that are interested in Cross-Attention-VizWiz-VQA are comparing it to the libraries listed below
- ☆23Aug 9, 2021Updated 4 years ago
- PyTorch VQA implementation that achieved top performances in the (ECCV18) VizWiz Grand Challenge: Answering Visual Questions from Blind P…☆63Oct 17, 2018Updated 7 years ago
- Implementation for the journal paper "DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering" (Jianyu et al., IEEE Tran…☆18Jun 22, 2021Updated 4 years ago
- Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": LXMERT…☆21Oct 20, 2020Updated 5 years ago
- Pytorch implementation of VQA: Visual Question Answering (https://arxiv.org/pdf/1505.00468.pdf) using VQA v2.0 dataset for open-ended ta…☆21Jul 30, 2020Updated 5 years ago
- PyTorch implementation of L-GCN [https://arxiv.org/abs/2008.09105]☆25Apr 25, 2021Updated 4 years ago
- Repository to perform multi animal pose detection. In particular this code is used for bee pose estimation.☆10Jan 10, 2022Updated 4 years ago
- ☆11Oct 9, 2024Updated last year
- Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding☆33Aug 29, 2019Updated 6 years ago
- A deep learning based application which is entitled to help the visually impaired people. The application automatically generates the tex…☆12Oct 2, 2020Updated 5 years ago
- Used in M4C feature extraction script: https://github.com/facebookresearch/mmf/blob/project/m4c/projects/M4C/scripts/extract_ocr_frcn_fea…☆13Jan 30, 2020Updated 6 years ago
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions☆10Jan 17, 2021Updated 5 years ago
- [NAACL 2022] TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding☆10Jul 15, 2023Updated 2 years ago
- Load and visualize different datasets in video question answering☆10May 11, 2021Updated 4 years ago
- ☆10Aug 22, 2023Updated 2 years ago
- Multilabel Out-of-Distribution Detection☆10Nov 23, 2020Updated 5 years ago
- An updated PyTorch implementation of hengyuan-hu's version for 'Bottom-Up and Top-Down Attention for Image Captioning and Visual Question…☆35Mar 21, 2022Updated 3 years ago
- ☆12Aug 29, 2019Updated 6 years ago
- SBB Mobile Machine Learning for Android devices☆12Updated this week
- ☆11Jan 14, 2017Updated 9 years ago
- ☆15Mar 27, 2024Updated last year
- ☆14Jul 13, 2021Updated 4 years ago
- ☆13Jun 26, 2021Updated 4 years ago
- ☆10Mar 30, 2022Updated 3 years ago
- video captioning using 3DCNN and LSTM (pytorch)☆11Sep 26, 2019Updated 6 years ago
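- Introductory Spring Cloud microservices tutorial, with basic integration demos of Eureka service registration and discovery, the Config configuration center, the Bus message bus, the FeignClient client, the Zuul gateway, Hystrix circuit breaking and fallback, Stream message queues, Sleuth trace monitoring, and Swagger documentation.☆11Aug 26, 2024Updated last year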
- AI-powered study companion for visually impaired students. Developed by Edumakers, from Tecnológico de Monterrey☆12Jun 20, 2024Updated last year
- EMNLP 2020: Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots☆12Dec 15, 2020Updated 5 years ago
- Source repo of Grimm project which is born to help the visually-impaired.☆16Dec 12, 2025Updated 2 months ago
- An application that reads out Braille in English for the visually impaired☆15Jul 16, 2017Updated 8 years ago
- Source Code of rTVRA for Hyperspectral Image Reconstruction on Dual-camera Compressive Hyperspectral Imaging System☆14Dec 23, 2020Updated 5 years ago
- Leveraging Local and Global Patterns for Self-Attention Networks☆12Jun 3, 2019Updated 6 years ago
- The code for reproducing "Frame Difference-Based Temporal Loss"☆11Sep 11, 2021Updated 4 years ago
- Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations☆14Aug 16, 2022Updated 3 years ago
- Smart-I is an android application aimed at helping the visually impaired using artificial intelligence and cloud computing.☆10Apr 13, 2022Updated 3 years ago
- [ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"☆69Oct 11, 2021Updated 4 years ago
- Code and data for the paper "Dual Dynamic Memory Network for End-to-End Multi-turn Task-oriented Dialog Systems".☆14Aug 16, 2022Updated 3 years ago
- Capstone Project: Assist the blind in moving around safely by warning them of impending obstacles using depth sensing, computer vision, a…☆18Nov 24, 2020Updated 5 years ago
- Background resampling for out-of-distribution detection☆13Mar 27, 2020Updated 5 years ago