Code for the baselines from the NeurIPS 2021 paper "DUE: End-to-End Document Understanding Benchmark."
☆36 Mar 2, 2023 Updated 2 years ago
Alternatives and similar repositories for baselines
Users who are interested in baselines compare it to the libraries listed below.
- Pytorch Implementation of Value Retrieval with Arbitrary Queries for Form-like Documents. ☆16 May 1, 2025 Updated 10 months ago
- A fast and highly accurate differentiable Top-k operator from the "Successive Halving Top-k Operator" AAAI'21 paper. ☆16 Jun 1, 2021 Updated 4 years ago
- baselines for DocVQA dataset ☆21 Apr 11, 2021 Updated 4 years ago
- Publicly released code for the LAMBERT model ☆105 Jun 14, 2021 Updated 4 years ago
- This repository contains a demonstrative implementation for pooling-based models, e.g., DeepPyramidion complementing our paper "Sparsifyi… ☆14 May 15, 2022 Updated 3 years ago
- Official implementation for Dessurt: Document end-to-end self-supervised understanding and recognition transformer ☆62 Jan 11, 2023 Updated 3 years ago
- Data and additional information regarding the paper: Contract Discovery. Dataset and a Few-Shot Semantic Retrieval Challenge with Competi… ☆32 Nov 12, 2020 Updated 5 years ago
- https://www.nlp.ecei.tohoku.ac.jp/projects/aio/ ☆16 Aug 4, 2022 Updated 3 years ago
- OCR Annotations from Amazon Textract for Industry Documents Library ☆103 Aug 20, 2022 Updated 3 years ago
- Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understan… ☆361 Oct 31, 2022 Updated 3 years ago
- ☆21 Jul 11, 2022 Updated 3 years ago
- MFAQ: a Multilingual FAQ Dataset ☆18 Sep 17, 2023 Updated 2 years ago
- DocILE: Document Information Localization and Extraction Benchmark ☆142 May 15, 2024 Updated last year
- Elasticsearch TMDB examples ☆21 Jul 6, 2024 Updated last year
- ☆81 Jun 12, 2023 Updated 2 years ago
- A curated list of resources for Document Understanding (DU) topic ☆1,501 Jun 2, 2023 Updated 2 years ago
- An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Informat… ☆53 Jan 9, 2024 Updated 2 years ago
- ☆59 Aug 18, 2021 Updated 4 years ago
- This repository is created to share current progress of transformer based optical character recognition(OCR). Welcome to share~ ☆55 Oct 9, 2023 Updated 2 years ago
- VisualMRC: Machine Reading Comprehension on Document Images (AAAI2021) ☆57 Mar 31, 2025 Updated 11 months ago
- ☆108 Feb 16, 2021 Updated 5 years ago
- Document Visual Question Answering ☆131 Jul 30, 2020 Updated 5 years ago
- Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating. ☆203 Mar 1, 2025 Updated last year
- Code for the ICDAR2021 paper "Visual FUDGE: Form Understanding via Dynamic Graph Editing" ☆33 Mar 4, 2022 Updated 3 years ago
- ☆680 Jun 3, 2025 Updated 8 months ago
- DocBank: A Benchmark Dataset for Document Layout Analysis ☆634 Aug 12, 2024 Updated last year
- ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction ☆33 Jul 20, 2022 Updated 3 years ago
- OCR & Ground Truth Resources ☆78 May 3, 2022 Updated 3 years ago
- Image classification for Recyclables ☆10 Sep 14, 2020 Updated 5 years ago
- mReasoner is a unified computational implementation of the model theory of thinking and reasoning ☆13 Aug 17, 2023 Updated 2 years ago
- Perspectrum: a dataset of claims, perspectives and evidence documents ☆34 Jan 16, 2020 Updated 6 years ago
- Bi-encoder Based Entity Linking Tutorial. You can run experiment only in 5 minutes. Experiments on Co-lab pro GPU are also supported! ☆34 May 3, 2021 Updated 4 years ago
- Flexible classic and NeurAl Retrieval Toolkit ☆221 Jun 28, 2025 Updated 8 months ago
- CORD: A Consolidated Receipt Dataset for Post-OCR Parsing ☆465 Jul 20, 2022 Updated 3 years ago
- This repository contains the code used to produce the results from the paper Automated Cross-prompt Scoring of Essay Traits published in … ☆32 Jan 20, 2022 Updated 4 years ago
- Dataset introduced in PlotQA: Reasoning over Scientific Plots ☆83 Jun 20, 2023 Updated 2 years ago
- Repository of the team 「賞金で二郎一生分食べたい!」 ("I want to eat a lifetime's worth of Jiro with the prize money!"). ☆11 Dec 9, 2021 Updated 4 years ago
- C4RepSet: Representative Subset from C4 data for Training Pre-trained LMs ☆11 Jan 13, 2023 Updated 3 years ago
- Security research organization dedicated to finding low hanging, critical, vulnerabilities. ☆15 May 12, 2022 Updated 3 years ago