PedroBarcha / old-books-dataset
Old book pages (with groundtruth), formerly used for OCR studies. The set comes in several versions that differ in resolution and binarization. Noised and denoised sets (produced by several methods) are to be uploaded eventually.
☆15 Updated 8 years ago
Alternatives and similar repositories for old-books-dataset
Users who are interested in old-books-dataset compare it to the repositories listed below.
- Code for the ICDAR2021 paper "Visual FUDGE: Form Understanding via Dynamic Graph Editing" ☆33 Updated 3 years ago
- ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction ☆27 Updated 6 years ago
- https://betterprogramming.pub/table-detection-and-extraction-tablenet-deep-learning-model-with-pytorch-from-images-64489e92b641 ☆15 Updated 2 years ago
- CVPR 2022: Table Structure Recognition ☆40 Updated 3 years ago
- Pytorch Implementation of TableNet ☆67 Updated 4 years ago
- Text and Layout Document Image Understanding. LayoutLM ☆23 Updated 4 years ago
- TableNet Implementation on Pytorch ☆150 Updated 3 years ago
- Handwritten text recognition using transformers. ☆158 Updated last year
- Pytorch implementation of our paper: Adapting OCR with Limited Labels ☆62 Updated last year
- Key Information Extraction From Documents: Evaluation And Generator ☆20 Updated 4 years ago
- Close-Domain fine-tuning for table detection ☆72 Updated 3 years ago
- DocILE: Document Information Localization and Extraction Benchmark ☆139 Updated last year
- ☆16 Updated 4 years ago
- CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images ☆133 Updated 3 months ago
- ☆127 Updated 5 years ago
- ☆15 Updated 5 years ago
- Document Visual Question Answering ☆128 Updated 5 years ago
- DIAR software for synthetic document image and groundtruth generation, with various degradation models for data augmentation ☆129 Updated 2 years ago
- CLEval: Character-Level Evaluation for Text Detection and Recognition Tasks ☆186 Updated 2 years ago
- TextTron is a simple light-weight image processing based text detector for document images. ☆53 Updated 4 years ago
- ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction ☆33 Updated 3 years ago
- Easter2.0: IMPROVING CONVOLUTIONAL MODELS FOR HANDWRITTEN TEXT RECOGNITION ☆79 Updated 2 years ago
- Research papers and code on information extraction from image/pdf ☆97 Updated 3 years ago
- Extraction of meaningful instances from document images with a Chargrid model ☆34 Updated 4 years ago
- Detectron2 for Document Layout Analysis ☆187 Updated last year
- A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers ☆59 Updated last year
- Sample implementation of OCR metrics (CER, WER) calculation with TesseractOCR and fastwer ☆29 Updated 4 years ago
- ☆141 Updated last year
- ☆18 Updated last year
- document image degradation ☆163 Updated 5 years ago