PedroBarcha / old-books-dataset
Old book pages (with ground truth), formerly used for OCR studies. The set comes in several versions that differ in resolution and binarization. Noised and denoised sets (produced by several methods) will eventually be uploaded.
☆15Updated 8 years ago
Alternatives and similar repositories for old-books-dataset
Users who are interested in old-books-dataset are comparing it to the repositories listed below
- Code for the ICDAR2021 paper "Visual FUDGE: Form Understanding via Dynamic Graph Editing"☆33Updated 3 years ago
- Pytorch implementation of our paper: Adapting OCR with Limited Labels☆62Updated 2 years ago
- ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction☆33Updated 3 years ago
- Easter2.0: IMPROVING CONVOLUTIONAL MODELS FOR HANDWRITTEN TEXT RECOGNITION☆79Updated 2 years ago
- Handwritten text recognition using transformers.☆158Updated last year
- CVPR 2022: Table Structure Recognition☆40Updated 3 years ago
- Extraction of meaningful instances from document images with a Chargrid model☆34Updated 4 years ago
- ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction☆27Updated 6 years ago
- Close-Domain fine-tuning for table detection☆72Updated 3 years ago
- A Dense Text Detection model using Receptive Field Blocks☆31Updated 3 years ago
- CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images☆133Updated 4 months ago
- TableNet: Deep Learning model for end-to-end Table Detection and Tabular data extraction from Scanned Data Images In modern times, more a…☆62Updated 3 years ago
- TableNet Implementation on Pytorch☆150Updated 3 years ago
- Deep learning, Convolutional neural networks, Image processing, Document processing, Table detection, Page object detection, Table classi…☆70Updated last year
- ☆18Updated last year
- DocILE: Document Information Localization and Extraction Benchmark☆139Updated last year
- Model for document segmentation trained on the midv-500-models dataset.☆78Updated 5 years ago
- https://betterprogramming.pub/table-detection-and-extraction-tablenet-deep-learning-model-with-pytorch-from-images-64489e92b641☆15Updated 2 years ago
- ☆87Updated 5 years ago
- Pytorch Implementation of TableNet☆67Updated 4 years ago
- Form images from U.S. National Archives annotated with text bounding boxes, classes, relationships, and transcription.☆38Updated 3 years ago
- Attention-based sequence-to-sequence model for handwritten word recognition☆62Updated last year
- This repository contains a 403-image dataset for table detection in documents.☆83Updated 7 years ago
- Research papers and code on information extraction from image/pdf☆97Updated 3 years ago
- Key Information Extraction From Documents: Evaluation And Generator☆20Updated 4 years ago
- document image degradation☆164Updated 5 years ago
- Code for my ICDAR paper "Deep Visual Template-Free Form Parsing"☆89Updated 3 years ago
- ☆16Updated 4 years ago
- Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Characte…☆237Updated last year
- Evaluation of the LayoutLM model on the CORD dataset☆32Updated 3 years ago