AI4Bharat / DocSim
Synthetically generate random text document images with ground-truth
☆11Updated 4 years ago
Alternatives and similar repositories for DocSim
Users who are interested in DocSim are comparing it to the libraries listed below
- TableNet Implementation on Pytorch☆148Updated 2 years ago
- ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction☆401Updated 4 years ago
- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition☆278Updated 2 years ago
- Pytorch Implementation of Chargrid Paper (https://arxiv.org/abs/1809.08799)☆27Updated 3 years ago
- Python library to extract tabular data from images and scanned PDFs☆277Updated 11 months ago
- CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)☆157Updated 2 years ago
- ICDAR 2019: MaskRCNN on PubLayNet datasets. Paragraph detection, table detection, figure detection,...☆182Updated 4 years ago
- Unofficial implementation of "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Docum…☆328Updated 2 years ago
- ☆371Updated last year
- Pytorch implementation of our paper: Adapting OCR with Limited Labels☆63Updated last year
- Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes☆436Updated last month
- Detectron2 for Document Layout Analysis☆188Updated 11 months ago
- CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images☆132Updated 5 months ago
- Document Layout Analysis☆379Updated last month
- Detect the tables in a form and extract the tables as well as the cells of the tables.☆64Updated 4 years ago
- Comparison-of-OCR (KerasOCR, PyTesseract, EasyOCR)☆62Updated 3 years ago
- Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset☆48Updated 2 years ago
- Library used to deskew a scanned document☆473Updated this week
- A comprehensive tutorial for OCR in python using Tesseract-OCR and OpenCV☆123Updated 3 years ago
- Document Image Enhancement with GANs - TPAMI journal☆201Updated 2 years ago
- CORD: A Consolidated Receipt Dataset for Post-OCR Parsing☆436Updated 3 years ago
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis☆355Updated 2 years ago
- ☆130Updated 2 years ago
- BoxDetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like character or checkbox boxes on sc…☆110Updated 2 years ago
- ICIP 2022: Adaptive Radial Projection on Fourier Magnitude Spectrum for Document Image Skew Estimation☆141Updated 2 months ago
- ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation (CVPR20)☆273Updated 4 years ago
- Document Layout Analysis resources repos for development with PdfPig.☆621Updated last year
- Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Characte…☆208Updated 6 months ago
- Denoising Dirty Documents: Optical Character Recognition (OCR) is the process of getting type or handwritten documents into a digitized …☆10Updated 4 years ago
- ☆143Updated 5 years ago