xiaojino/RUArt

RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering

☆10

Alternatives and similar repositories for RUArt

Users that are interested in RUArt are comparing it to the libraries listed below.

uakarsh / latr
Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answer…
☆56 · Updated this week
xinke-wang / Awesome-Text-VQA
☆188 · May 8, 2024 · Updated 2 years ago
yashkant / sam-textvqa
Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020.
☆65 · Sep 15, 2021 · Updated 4 years ago
microsoft / TAP
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption, CVPR 2021 (Oral)
☆72 · May 22, 2023 · Updated 3 years ago
ChenyuGAO-CS / SMA
The imdb files with SBD-Trans OCR for TextVQA dataset.
☆11 · Nov 30, 2021 · Updated 4 years ago
husterpzh / PSSR
Official code for the paper: "Perception and Semantic Aware Regularization for Sequential Confidence Calibration" (CVPR 2023)
☆10 · May 15, 2024 · Updated 2 years ago
shenxiang-vqa / LSAT
Local self-attention in Transformer for visual question answering
☆13 · Mar 17, 2024 · Updated 2 years ago
Xiaomeng-Yang / STR_benchmark_cleansed
☆14 · May 26, 2023 · Updated 3 years ago
prdwb / okvqa-release
☆15 · May 10, 2021 · Updated 5 years ago
alirezasalemi7 / DEDR-MM-FiD
The code for the paper: A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering
☆14 · Aug 22, 2023 · Updated 2 years ago
ecoxial2007 / FGRW_MedVQA
Fine-Grained Knowledge Fusion for Retrieval-Augmented Medical Visual Question
☆11 · Jul 18, 2024 · Updated 2 years ago
ronghanghu / vqa-maskrcnn-benchmark-m4c
Used in M4C feature extraction script: https://github.com/facebookresearch/mmf/blob/project/m4c/projects/M4C/scripts/extract_ocr_frcn_fea…
☆13 · Jan 30, 2020 · Updated 6 years ago
amarnaths0005 / coonsBicubicSurface
HTML5 Application to manipulate a Coons Bicubic Surface in 3D using its corner points, U and W tangents and UW twists.
☆11 · Aug 19, 2019 · Updated 6 years ago
VITA-Group / layerGraftedPretraining_ICLR23
[ICLR 2023] “Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Better Representations”, Ziyu Jian…
☆24 · Feb 16, 2023 · Updated 3 years ago
taeho-kil / Scene-Text-Rectification
Scene text rectification using glyph and character alignment properties
☆22 · Jan 21, 2018 · Updated 8 years ago
furkanbiten / stvqa_amazon_ocr
STVQA and TextVQA OCR results from Amazon Text in Image pipeline
☆12 · Jul 18, 2022 · Updated 4 years ago
YZHJessica / CDVQA
☆14 · Feb 17, 2023 · Updated 3 years ago
guoyang9 / UnifER
Official implementation for the MM'22 paper.
☆14 · Jun 30, 2022 · Updated 4 years ago
Rid7 / OCR_DataSet
Collects and organizes OCR-related datasets and unifies their annotation format for use in experiments.
☆12 · May 17, 2023 · Updated 3 years ago
zoujuny / TableCell
Builds on TableBank by refining the annotations to cell-level precision and uses object detection/segmentation to localize table cells.
☆14 · Dec 11, 2019 · Updated 6 years ago
val-iisc / RMLVQA
☆19 · May 31, 2023 · Updated 3 years ago
jingjing12110 / MixPHM
[CVPR 2023] PyTorch code for MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering
☆17 · Jul 11, 2023 · Updated 3 years ago
tub-rip / event_penguins
The official implementation of "Low-power, Continuous Remote Behavioral Localization with Event Cameras" (CVPR 2024)
☆13 · Sep 25, 2024 · Updated last year
MbassiJaphet / pytorch-for-information-extraction
Tutorial demonstrating how to leverage PyTorch and its features to carry out Information Extraction.
☆11 · Dec 1, 2020 · Updated 5 years ago
mavillot / FUNSD-Entity-Linking
☆10 · Nov 15, 2021 · Updated 4 years ago
bilal-rachik / Information-extraction-from-document
Graph Key Information Extraction: GKIE
☆11 · Sep 15, 2022 · Updated 3 years ago
CCIIPLab / DPT
Code for the IJCAI 2022 paper "Declaration-based Prompt Tuning for Visual Question Answering"
☆20 · May 10, 2022 · Updated 4 years ago
Actasidiot / EFIFSTR
[ACM MM 2020] Exploring Font-independent Features for Scene Text Recognition
☆44 · Nov 30, 2020 · Updated 5 years ago
KugaMaxx / yam-toolkit
A simple toolkit for processing event-based data.
☆13 · Apr 7, 2026 · Updated 3 months ago
yuranusduke / CMT-Convolutional-NN-Meets-ViT
Unofficial PyTorch implementation of CMT
☆13 · Jul 16, 2021 · Updated 5 years ago
longbai1006 / Surgical-VQLAPlus
Official Implementation of "Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question Localized-Answering i…
☆15 · May 6, 2025 · Updated last year
ovguyo / captions-in-VQA
Using image captions with LLM for zero-shot VQA
☆19 · Mar 14, 2024 · Updated 2 years ago
ZJULearning / TreeAttention
A Better Way to Attend: Attention with Trees for Video Question Answering
☆25 · Mar 25, 2019 · Updated 7 years ago
VUT-HFUT / MAC_2024_baseline
[MAC 2024] The baseline code for MAC 2024.
☆12 · Jun 3, 2025 · Updated last year
lvjianjin / TextRecognitionDataGenerator
A tool for generating CRNN training datasets, mainly targeting Simplified Chinese.
☆15 · Apr 19, 2022 · Updated 4 years ago
HCIILAB / LAST
Read Ten Lines at One Glance: Line-Aware Semi-Autoregressive Transformer for Multi-Line Handwritten Mathematical Expression Recognition
☆28 · Aug 29, 2023 · Updated 2 years ago
tokokudo / mesa-Panfork-android
Mali G610 & 710 GPU Driver for Termux
☆16 · Mar 15, 2026 · Updated 4 months ago
wkcn / AttentionSampler
Attention-based sampler in TASN (Trilinear Attention Sampling Network)
☆23 · Jun 8, 2020 · Updated 6 years ago
HAWLYQ / Qc-TextCap
☆16 · Dec 25, 2021 · Updated 4 years ago