☆22Dec 8, 2022Updated 3 years ago
Alternatives and similar repositories for TAG
Users that are interested in TAG are comparing it to the libraries listed below
- TAP: Text-Aware Pre-training for Text-VQA and Text-Caption, CVPR 2021 (Oral)☆72May 22, 2023Updated 2 years ago
- Pytorch Implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.☆16May 1, 2025Updated 10 months ago
- ☆15Oct 27, 2020Updated 5 years ago
- [AAAI 2021] Confidence-aware Non-repetitive Multimodal Transformers for TextCaps☆24Mar 29, 2023Updated 2 years ago
- Official repository of the paper: "A Comprehensive Gold Standard and Benchmark for Comics Text Detection and Recognition"☆26Jul 10, 2023Updated 2 years ago
- Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answer…☆55Oct 30, 2024Updated last year
- OCR Annotations from Amazon Textract for Industry Documents Library☆103Aug 20, 2022Updated 3 years ago
- PyTorch implementation of BMVC2022 paper Masked Vision-Language Transformers for Scene Text Recognition☆29Nov 11, 2022Updated 3 years ago
- Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps [AAAI2021]☆57Apr 5, 2022Updated 3 years ago
- ☆69Jan 9, 2024Updated 2 years ago
- The official code of Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition (IJCAI2023)☆27Sep 3, 2023Updated 2 years ago
- Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020.☆65Sep 15, 2021Updated 4 years ago
- ☆34Jun 22, 2023Updated 2 years ago
- An instruction data generation system for multimodal language models.☆35Jan 31, 2025Updated last year
- Repository for "CoMix: Comprehensive Benchmark for Multi-Task Comic Understanding"☆16Nov 20, 2024Updated last year
- RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering☆10Nov 27, 2022Updated 3 years ago
- WILL™ SDK for ink supports a variety of input technologies and generates the highest quality, most attractive digital ink outputs via the…☆12Jul 1, 2024Updated last year
- ☆10Jan 20, 2024Updated 2 years ago
- Online BaseHangul Encoder And Decoder☆12Jan 30, 2023Updated 3 years ago
- Learning Low-rank and Sparse Discriminative Correlation Filters for Coarse-to-Fine Visual Object Tracking☆10Apr 15, 2021Updated 4 years ago
- [ACM MM 2020] Exploring Font-independent Features for Scene Text Recognition☆44Nov 30, 2020Updated 5 years ago
- Library for the Test-based Calibration Error (TCE) metric to quantify the degree of classifier calibration.☆13Sep 15, 2023Updated 2 years ago
- Implementation of Mutan+ArticleNet on OKVQA☆10Jan 11, 2021Updated 5 years ago
- Laphet: A tiny neural network language modeling library designed for students and educators.☆11Feb 9, 2025Updated last year
- Used in M4C feature extraction script: https://github.com/facebookresearch/mmf/blob/project/m4c/projects/M4C/scripts/extract_ocr_frcn_fea…☆13Jan 30, 2020Updated 6 years ago
- Universal Dependency Tree for Myanmar Language☆10Feb 9, 2025Updated last year
- Recipes to prepare datasets!☆15Dec 6, 2025Updated 2 months ago
- https://wavelandspeech.github.io/☆10Jan 12, 2024Updated 2 years ago
- ☆10May 9, 2019Updated 6 years ago
- ☆10Jul 23, 2021Updated 4 years ago
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models☆15Mar 8, 2023Updated 2 years ago
- ☆11Jul 3, 2023Updated 2 years ago
- Tips for creating a beautiful table via booktabs☆17Apr 28, 2024Updated last year
- Official Pytorch codebase for Open-Vocabulary Instance Segmentation without Manual Mask Annotations [CVPR 2023]☆52Oct 26, 2025Updated 4 months ago
- The official source code for TaleBrush (CHI 2022)☆15Jul 13, 2022Updated 3 years ago
- Multiple Anchor Learning for Visual Object Detection (CVPR,2020)☆14Mar 18, 2021Updated 4 years ago
- This repository contains the dataset used to train the neural network model described in the paper "Implicit HRTF Modeling Using Tempora…☆11Aug 4, 2023Updated 2 years ago
- Lens Protocol Starter Kit☆12Jun 5, 2022Updated 3 years ago
- Official implementation of "What does CLIP know about a red circle? Visual Prompt Engineering for VLMs", ICCV 2023☆11Sep 21, 2023Updated 2 years ago