Python tools for Tesseract OCR training
☆26 · May 2, 2022 · Updated 3 years ago
Alternatives and similar repositories for pytesstrain
Users who are interested in pytesstrain are comparing it to the libraries listed below.
- Convert Transkribus PAGE-XML to standard PAGE-XML ☆12 · Dec 10, 2025 · Updated 2 months ago
- Small collection of PAGE XML related scripts used at the ZPD Würzburg ☆12 · Aug 2, 2024 · Updated last year
- Cosine Similarity Search in ElasticSearch + FAISS GPU ☆12 · Mar 24, 2022 · Updated 3 years ago
- Some bits of javascript to transcribe scanned pages using PageXML ☆17 · Mar 18, 2024 · Updated last year
- ☆14 · Jul 11, 2022 · Updated 3 years ago
- Kong OAuth SSO Integration ☆16 · Aug 23, 2017 · Updated 8 years ago
- ☆20 · Aug 18, 2019 · Updated 6 years ago
- Recognition Models for Kraken and CLSTM ☆16 · Aug 21, 2019 · Updated 6 years ago
- Orchestrate web crawlers to create structured datasets from multiple data sources with YAML configs. ☆15 · Dec 8, 2022 · Updated 3 years ago
- Tutorial on how to create metrics dashboards like the THOR Dashboard ☆14 · Mar 8, 2017 · Updated 8 years ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 · Apr 30, 2025 · Updated 10 months ago
- A compound splitter based on the semantic regularities in the vector space of word embeddings. ☆16 · Mar 15, 2017 · Updated 8 years ago
- Layout analysis to find layout elements in documents (similar to P2PaLA) ☆20 · Feb 27, 2026 · Updated last week
- An extensible viewer for OCR-D mets.xml files ☆23 · May 30, 2024 · Updated last year
- Next generation OCR engine based on LSTMs. ☆51 · Apr 8, 2018 · Updated 7 years ago
- Code accompanying our paper "One Knowledge Graph to Rule them All? Analyzing the Differences between DBpedia, YAGO, Wikidata & co." ☆26 · Jul 18, 2017 · Updated 8 years ago
- a Deep Learning based Speller ☆28 · Jan 21, 2019 · Updated 7 years ago
- Finetuned traineddata files for Arabic ☆31 · Feb 28, 2019 · Updated 7 years ago
- Page-wise text recognition with lower-supervision line data models ☆51 · Feb 27, 2026 · Updated last week
- python library ☆12 · Nov 25, 2025 · Updated 3 months ago
- Transcriptions of reading primers (19th century) ☆11 · Oct 31, 2025 · Updated 4 months ago
- Text Re-use Alignment Visualization ☆38 · Nov 8, 2017 · Updated 8 years ago
- This repository is about how to build an SQLite version of the Arabic WordNet database. ☆10 · Mar 19, 2019 · Updated 6 years ago
- Input pipelines for large scale, sharded training of deep learning models. ☆40 · Jun 18, 2019 · Updated 6 years ago
- Super simple, zero config options, <2kb declarative tooltip library with no dependencies. ☆17 · Jun 2, 2023 · Updated 2 years ago
- Automation of NOAA satellite reception ☆13 · May 14, 2025 · Updated 9 months ago
- ☆25 · Dec 15, 2025 · Updated 2 months ago
- MACE is A C++ Engine ☆10 · Dec 9, 2019 · Updated 6 years ago
- A PGN reviewer and publishing platform for the Kindle, allowing users to review annotated Chess games from their digital e-reader. ☆23 · Apr 25, 2011 · Updated 14 years ago
- NLP-helper for OCR-ed pages in PAGE XML format ☆10 · Dec 6, 2024 · Updated last year
- Data visualization workshop ☆11 · May 12, 2020 · Updated 5 years ago
- golang package to provide lightweight internal pub/sub for goroutines ☆29 · Jan 23, 2014 · Updated 12 years ago
- Project to digitize avant-garde periodicals ☆12 · May 13, 2022 · Updated 3 years ago
- Knock your images before you get stressed. ☆11 · Jan 9, 2022 · Updated 4 years ago
- A collection of OCR'd and machine-corrected Greek texts. This base repository contains Git submodules for the different works and an inve… ☆11 · Nov 18, 2014 · Updated 11 years ago
- Sonnet WebApp (Full Version) ☆10 · Aug 31, 2019 · Updated 6 years ago
- Tools and Examples for Computational Text Analysis for Assyriologists. ☆11 · Sep 3, 2018 · Updated 7 years ago
- Faster access to Tesseract-OCR from Python ☆13 · Jun 8, 2021 · Updated 4 years ago
- K-RET: Knowledgeable Biomedical Relation Extraction System ☆10 · Feb 22, 2025 · Updated last year