Contrast-guided Feature Adjustment Module for Visual Information Extraction
☆30May 23, 2023Updated 2 years ago
Alternatives and similar repositories for CFAM
Users that are interested in CFAM are comparing it to the libraries listed below
Sorting:
- Code for ICCV 2023 Paper : “ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction”☆54Aug 8, 2023Updated 2 years ago
- Code for AAAI 2023 Paper : “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models”☆18Dec 6, 2022Updated 3 years ago
- HHH☆36May 2, 2022Updated 3 years ago
- DocILE: Document Information Localization and Extraction Benchmark☆142May 15, 2024Updated last year
- This repository is a concise collection of well known deep learning based document binarization models.☆27Dec 24, 2022Updated 3 years ago
- ☆14May 26, 2023Updated 2 years ago
- PyTorch implementation of BMVC2022 paper Masked Vision-Language Transformers for Scene Text Recognition☆29Nov 11, 2022Updated 3 years ago
- A PyTorch implementation of "From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network" (ICCV2021)☆106Dec 9, 2021Updated 4 years ago
- ☆18Jun 7, 2023Updated 2 years ago
- [MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking f…☆20Dec 4, 2024Updated last year
- We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datas…☆80Feb 8, 2023Updated 3 years ago
- Official implementation for Dessurt: Document end-to-end self-supervised understanding and recognition transformer☆62Jan 11, 2023Updated 3 years ago
- InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)☆162May 31, 2024Updated last year
- Official code for the paper: "Perception and Semantic Aware Regularization for Sequential Confidence Calibration (CVPR2023)"☆10May 15, 2024Updated last year
- TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning☆23Sep 17, 2024Updated last year
- Official PyTorch implementation of `Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition`☆74Feb 27, 2023Updated 3 years ago
- Official implementation for "GLASS: Global to Local Attention for Scene-Text Spotting" (ECCV'22)☆102Jun 28, 2024Updated last year
- ☆132Mar 24, 2023Updated 2 years ago
- Evaluation Tool for the ICDAR 2019 Competition on Table Detection and Recognition☆42May 8, 2022Updated 3 years ago
- EATEN: Entity-aware Attention for Single Shot Visual Text Extraction☆184Dec 29, 2019Updated 6 years ago
- Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understan…☆361Oct 31, 2022Updated 3 years ago
- ☆12Jun 11, 2023Updated 2 years ago
- Read Ten Lines at One Glance: Line-Aware Semi-Autoregressive Transformer for Multi-Line Handwritten Mathematical Expression Recognition☆28Aug 29, 2023Updated 2 years ago
- Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.☆203Mar 1, 2025Updated last year
- A new video text spotting framework with Transformer☆78May 23, 2022Updated 3 years ago
- The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++:…☆282May 30, 2025Updated 9 months ago
- XFUND: A Multilingual Form Understanding Benchmark☆217Jul 15, 2022Updated 3 years ago
- Code and data for the paper: DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator☆12Apr 28, 2024Updated last year
- ☆16Jan 30, 2022Updated 4 years ago
- A face detection base on faster-rcnn.pytorch☆10Feb 9, 2018Updated 8 years ago
- It's the code for the paper Pushing the Performance Limit of Scene Text Recognizer without Human Annotation, CVPR 2022.☆28Jul 6, 2022Updated 3 years ago
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…☆63May 15, 2025Updated 9 months ago
- Code for the ICDAR2021 paper "Visual FUDGE: Form Understanding via Dynamic Graph Editing"☆33Mar 4, 2022Updated 3 years ago
- ☆81Jun 12, 2023Updated 2 years ago
- ☆17Jun 12, 2024Updated last year
- [CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding☆27Dec 18, 2025Updated 2 months ago
- Code for "Primitive Representation Learning for Scene Text Recognition" (CVPR 2021)☆82May 11, 2022Updated 3 years ago
- Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)☆126Nov 13, 2023Updated 2 years ago
- The HierText dataset contains ~12k images from the Open Images dataset v6 with large amount of text entities. We provide word, line and p…☆306Dec 2, 2024Updated last year