RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering
☆10Nov 27, 2022Updated 3 years ago
Alternatives and similar repositories for RUArt
Users that are interested in RUArt are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Implementation of LaTr: Layout-aware transformer for scene-text VQA,a novel multimodal architecture for Scene Text Visual Question Answer…☆56Oct 30, 2024Updated last year
- ☆188May 8, 2024Updated last year
- Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020.☆65Sep 15, 2021Updated 4 years ago
- TAP: Text-Aware Pre-training for Text-VQA and Text-Caption, CVPR 2021 (Oral)☆72May 22, 2023Updated 2 years ago
- The imdb files with SBD-Trans OCR for TextVQA dataset.☆11Nov 30, 2021Updated 4 years ago
- Serverless GPU API endpoints on Runpod - Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- Official code for the paper: "Perception and Semantic Aware Regularization for Sequential Confidence Calibration (CVPR2023)"☆10May 15, 2024Updated last year
- Local self-attention in Transformer for visual question answering☆13Mar 17, 2024Updated 2 years ago
- MT终端扩展包-非官方版(Termux环境版)☆18Mar 8, 2026Updated last month
- ☆14May 26, 2023Updated 2 years ago
- ☆15May 10, 2021Updated 4 years ago
- Fine-Grained Knowledge Fusion for Retrieval-Augmented Medical Visual Question☆11Jul 18, 2024Updated last year
- the code for paper: A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering☆13Aug 22, 2023Updated 2 years ago
- Used in M4C feature extraction script: https://github.com/facebookresearch/mmf/blob/project/m4c/projects/M4C/scripts/extract_ocr_frcn_fea…☆13Jan 30, 2020Updated 6 years ago
- ☆13Feb 17, 2023Updated 3 years ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- [ICLR 2023] “ Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Better Representations”, Ziyu Jian…☆24Feb 16, 2023Updated 3 years ago
- STVQA and TextVQA OCR results from Amazon Text in Image pipeline☆12Jul 18, 2022Updated 3 years ago
- Scene text rectification using glyph and character alignment properties☆22Jan 21, 2018Updated 8 years ago
- Official implementation for the MM'22 paper.☆14Jun 30, 2022Updated 3 years ago
- 在TableBank的基础上,进一步标注到单元格精度,利用目标检测/分割实现单元格定位。☆14Dec 11, 2019Updated 6 years ago
- 收集并整理有关OCR的数据集并统一标注格式,以便实验需要☆12May 17, 2023Updated 2 years ago
- ☆18May 31, 2023Updated 2 years ago
- [CVPR 2023] Pytorch Code of MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering☆17Jul 11, 2023Updated 2 years ago
- Tutorial demonstrating how to leverage Pytorch and its features to carry out Information Extraction.☆11Dec 1, 2020Updated 5 years ago
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- HTML5 Application to manipulate a Coons Bicubic Surface in 3D using its corner points, U and W tangents and UW twists.☆11Aug 19, 2019Updated 6 years ago
- The official implementation of "Low-power, Continuous Remote Behavioral Localization with Event Cameras" (CVPR 2024)☆12Sep 25, 2024Updated last year
- ☆10Nov 15, 2021Updated 4 years ago
- Graph Key Information Extraction: GKIE☆11Sep 15, 2022Updated 3 years ago
- A simple toolkit for processing event-based data.☆13Apr 7, 2026Updated last week
- The code of IJCAI2022 paper, Declaration-based Prompt Tuning for Visual Question Answering☆20May 10, 2022Updated 3 years ago
- The code of SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models☆22Mar 25, 2026Updated 3 weeks ago
- ☆12Mar 7, 2019Updated 7 years ago
- [IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering☆20Jul 6, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Pytorch unofficial implementation of CMT☆13Jul 16, 2021Updated 4 years ago
- [ACM MM 2020] Exploring Font-independent Features for Scene Text Recognition☆44Nov 30, 2020Updated 5 years ago
- Using image captions with LLM for zero-shot VQA☆19Mar 14, 2024Updated 2 years ago
- A Better Way to Attend: Attention with Trees for Video Question Answering☆25Mar 25, 2019Updated 7 years ago
- 一个生成crnn训练数据集的工具,主要针对简体中文。☆15Apr 19, 2022Updated 4 years ago
- [MAC 2024] The baseline code for MAC 2024.☆12Jun 3, 2025Updated 10 months ago
- Attention-based sampler in TASN (Trilinear Attention Sampling Network)☆23Jun 8, 2020Updated 5 years ago