A tutorial on the PyTorch-based OCRopus components.
☆73 Apr 18, 2020 Updated 5 years ago
Alternatives and similar repositories for das2018-tutorial
Users interested in das2018-tutorial are comparing it to the libraries listed below.
- ☆25 Apr 18, 2020 Updated 5 years ago
- ☆72 Jun 13, 2018 Updated 7 years ago
- ☆20 Aug 18, 2019 Updated 6 years ago
- Repository collecting all the submodules for the new PyTorch-based OCR System. ☆142 Feb 22, 2021 Updated 5 years ago
- ☆126 Apr 18, 2020 Updated 5 years ago
- ☆10 Mar 16, 2023 Updated 3 years ago
- Next generation OCR engine based on LSTMs. ☆51 Apr 8, 2018 Updated 7 years ago
- Segmenting a given document using recursive xy-cut algorithm. ☆12 Oct 9, 2018 Updated 7 years ago
- OCRopus model for Gothic print (Fraktur) ☆19 Feb 16, 2020 Updated 6 years ago
- Rotation and skew detection using DL. ☆60 May 29, 2018 Updated 7 years ago
- Document image degradation ☆164 May 18, 2020 Updated 5 years ago
- Process, enhance and evaluate multiple OCR outputs. ☆24 Dec 2, 2025 Updated 3 months ago
- Glyph Miner, a system for extracting glyphs from early typeset prints ☆34 Sep 29, 2016 Updated 9 years ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆24 Jul 18, 2019 Updated 6 years ago
- DFKI Layout Detection for OCR-D ☆47 May 1, 2025 Updated 10 months ago
- ☆14 Apr 18, 2020 Updated 5 years ago
- Ergonomic line-by-line transcription of scanned text. ☆54 Feb 2, 2026 Updated last month
- Augment line images for improving OCR datasets ☆10 Oct 4, 2023 Updated 2 years ago
- OCR-D post-correction module based on weighted finite-state transducers ☆11 Jan 13, 2024 Updated 2 years ago
- Tools for TICCL ☆14 Dec 12, 2025 Updated 3 months ago
- This is an OCR solution for receipts, invoices, etc. ☆20 May 24, 2020 Updated 5 years ago
- tesseractXplore, an easy-to-use Tesseract GUI with full control ☆28 Nov 10, 2021 Updated 4 years ago
- Convert between Tesseract hOCR and ALTO XML using XSL stylesheets ☆59 Sep 25, 2025 Updated 5 months ago
- Convert PubLayNet data into METS/PAGE-XML ☆10 Mar 17, 2020 Updated 6 years ago
- DatasetImgLabeler is an image annotation tool for researchers to prepare datasets in ICDAR2015 format ☆12 Dec 7, 2019 Updated 6 years ago
- ☆25 Apr 22, 2018 Updated 7 years ago
- Converters for various file formats used for representing OCR ☆12 Apr 30, 2025 Updated 10 months ago
- ☆26 Apr 18, 2020 Updated 5 years ago
- ☆138 Apr 4, 2023 Updated 2 years ago
- Web based JavaScript GUI library for proofreading/editing hOCR ☆101 Sep 17, 2018 Updated 7 years ago
- OCR-D post-correction with encoder-attention-decoder LSTMs ☆13 May 1, 2025 Updated 10 months ago
- ☆22 Dec 6, 2018 Updated 7 years ago
- DocBankLoader is a dataset loader for DocBank, and can convert DocBank to the Object Detection models' format. ☆25 Mar 17, 2021 Updated 5 years ago
- Japanese trained data of clstm ☆15 Jun 6, 2016 Updated 9 years ago
- Development version of ndlstm, multidimensional LSTMs for TensorFlow ☆19 Feb 20, 2018 Updated 8 years ago
- A simplified implementation of the paper: Improved Localization Accuracy by LocNet for Faster R-CNN Based Text Detection ☆29 Jul 12, 2018 Updated 7 years ago
- Obsolete repo, merged into eynollah ☆12 Sep 29, 2025 Updated 5 months ago
- Crop And Splice Segments (of scanned pages) ☆14 Mar 11, 2019 Updated 7 years ago
- Command line tool to convert page layout files to the latest PAGE XML format. It supports all previous versions of the PAGE format as wel… ☆24 Jan 30, 2021 Updated 5 years ago