AI-Application-and-Integration-Lab / Scene-Text-Detection-And-Recognition-Model_M503
☆13 Updated 11 months ago
Alternatives and similar repositories for Scene-Text-Detection-And-Recognition-Model_M503:
Users who are interested in Scene-Text-Detection-And-Recognition-Model_M503 are comparing it to the repositories listed below:
- ☆14 Updated 2 years ago
- An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Informat… ☆53 Updated last year
- ☆128 Updated 11 months ago
- ☆175 Updated 6 months ago
- Applied Deep Learning (2021 Spring) at National Taiwan University (NTU) CSIE ☆9 Updated 3 years ago
- Document Artificial Intelligence ☆141 Updated last month
- Code for CVPR21 paper A Multiplexed Network for End-to-End, Multilingual OCR ☆80 Updated 2 years ago
- ☆58 Updated last year
- [NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning ☆89 Updated 3 weeks ago
- Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train … ☆183 Updated 4 months ago
- OCR Annotations from Amazon Textract for Industry Documents Library ☆101 Updated 2 years ago
- ☆40 Updated 8 months ago
- An implementation of "CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model". ☆128 Updated last month
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions ☆42 Updated 9 months ago
- Official Implementation of SCOB [ICCV 2023] ☆22 Updated last year
- Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models, EMNLP 2023 ☆44 Updated 7 months ago
- PyTorch implementation of Value Retrieval with Arbitrary Queries for Form-like Documents. ☆16 Updated last year
- ☆36 Updated 8 months ago
- ☆66 Updated 5 months ago
- Code for ICCV 2023 paper: "ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction" ☆50 Updated last year
- The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++:… ☆255 Updated 5 months ago
- The code related to the baselines from NeurIPS 2021 paper "DUE: End-to-End Document Understanding Benchmark." ☆36 Updated last year
- An open-source implementation for fine-tuning the Qwen2-VL series by Alibaba Cloud. ☆173 Updated this week
- This is the official repository for Retrieval Augmented Visual Question Answering ☆202 Updated last month
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs ☆93 Updated last month
- Multimodal Semi-Supervised Learning for Text Recognition (SemiMTR) ☆81 Updated last year
- Instruction tuning dataset generation inspired by LLaVA-Instruct-158k via any LLM, also for commercial use. ☆12 Updated 10 months ago
- ☆109 Updated 6 months ago
- Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating. ☆174 Updated last month
- ☆35 Updated last year