PedroBarcha / old-books-dataset
Old book pages with ground truth, formerly used for OCR studies. The set comes in several versions that differ in resolution and binarization. Noised and denoised sets, produced by several methods, will be uploaded eventually.
☆12Updated 7 years ago
Alternatives and similar repositories for old-books-dataset:
Users who are interested in old-books-dataset are comparing it to the repositories listed below:
- ☆9Updated 5 years ago
- ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction☆27Updated 5 years ago
- ☆21Updated 2 years ago
- Text and Layout Document Image Understanding. LayoutLM☆23Updated 3 years ago
- A U-Net based deep learning model to remove line/box/spurious artifacts from text images. Unsupervised training.☆58Updated 5 years ago
- ☆15Updated 4 years ago
- Code for my ICDAR paper "Deep Visual Template-Free Form Parsing"☆88Updated 3 years ago
- Extraction of meaningful instances from document images with a Chargrid model☆34Updated 3 years ago
- Detect textlines in document images☆92Updated 10 months ago
- Key Information Extraction From Documents: Evaluation And Generator☆20Updated 4 years ago
- DIAR software for synthetic document image and groundtruth generation, with various degradation models for data augmentation☆120Updated last year
- Code for the ICDAR2021 paper "Visual FUDGE: Form Understanding via Dynamic Graph Editing"☆33Updated 3 years ago
- TensorFlow implementation of a segmentation system for document images.☆34Updated 6 years ago
- TextTron is a simple light-weight image processing based text detector for document images.☆52Updated 4 years ago
- ShabbyPages is a state-of-the-art corpus of born-digital document images with both ground truth and distorted versions appropriate for us…☆57Updated 2 weeks ago
- DL models that take a document image file as input, locate the position of paragraphs, lines, images, etc. with their labels and confiden…☆26Updated 4 years ago
- Document Scanner and Word Segmentation☆122Updated 4 years ago
- Line Segmentation of Handwritten Documents using the A* Path Planning Algorithm☆27Updated 4 years ago
- [WIP] A Pytorch implementation of DB-Text - Real-time Scene Text Detection with Differentiable Binarization☆38Updated 2 years ago
- ☆12Updated 4 years ago
- Publicly released code for the LAMBERT model☆103Updated 3 years ago
- A Dense Text Detection model using Receptive Field Blocks☆31Updated 2 years ago
- CVPR 2022: Table Structure Recognition☆39Updated 2 years ago
- Repo to host the forms dataset☆15Updated 4 years ago
- Convolutional recurrent neural network for scene text recognition or OCR in Keras☆125Updated 3 years ago
- Public implementation of our CVPR Paper "OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page TextRecognition by learnin…☆144Updated 3 years ago
- code for participation in ICDAR2021 Table Recognition track (Team Name: LTIAYN = Kaen Context)☆21Updated 3 years ago
- ☆138Updated last year
- A list of open datasets for OCR.☆100Updated 7 years ago
- Document Image Augmentation is a tool for augmenting axis-aligned document images☆32Updated 4 years ago