Python tools for Tesseract OCR training
☆26 · May 2, 2022 · Updated 3 years ago
Alternatives and similar repositories for pytesstrain
Users who are interested in pytesstrain are comparing it to the libraries listed below.
- Convert Transkribus PAGE-XML to standard PAGE-XML ☆12 · Dec 10, 2025 · Updated 4 months ago
- Check your modified Ground Truth files with visual support! ☆10 · Jan 31, 2024 · Updated 2 years ago
- Crop And Splice Segments (of scanned pages) ☆14 · Mar 11, 2019 · Updated 7 years ago
- Small collection of PAGE XML related scripts used at the ZPD Würzburg ☆12 · Aug 2, 2024 · Updated last year
- ☆14 · Jul 11, 2022 · Updated 3 years ago
- Some bits of JavaScript to transcribe scanned pages using PageXML ☆17 · Mar 18, 2024 · Updated 2 years ago
- An extensible viewer for OCR-D mets.xml files ☆23 · May 30, 2024 · Updated last year
- An approximate nearest-neighbor search for text reuse. ☆12 · Oct 5, 2020 · Updated 5 years ago
- NLP-helper for OCR-ed pages in PAGE XML format ☆10 · Dec 6, 2024 · Updated last year
- ☆13 · Aug 5, 2025 · Updated 8 months ago
- Source code and documentation of a specialized computer assisted synthesis planning (CASP) tool used for the deconstruction of ring syste… ☆12 · May 25, 2020 · Updated 5 years ago
- NewsEye / READ OCR training dataset from Austrian Newspapers (1864–1911) ☆18 · Oct 31, 2025 · Updated 5 months ago
- Shan Natural Language Processing tools inspired by PythaiNLP ☆14 · Mar 1, 2026 · Updated last month
- Cosine Similarity Search in ElasticSearch + FAISS GPU ☆12 · Mar 24, 2022 · Updated 4 years ago
- This repository provides German documentation relating to the text recognition and transcription platform eScriptorium. The documentation… ☆15 · Dec 6, 2025 · Updated 4 months ago
- An implementation of Roger Sayle's smizip algorithm for short string compression, like SMILES strings ☆13 · Feb 1, 2025 · Updated last year
- Reichsanzeiger-NLP: NER/NEL corpus for the German historical newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–19… ☆16 · Oct 18, 2024 · Updated last year
- Images of example pages from Transkribus model training sets to make it easier to find a match. ☆15 · Jan 25, 2022 · Updated 4 years ago
- Speech Emotion Recognition using PyTorch sponsored by AIS and VISTEC-DEPA AIResearch Institute Thailand. ☆22 · Nov 6, 2021 · Updated 4 years ago
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen" ☆12 · Dec 17, 2021 · Updated 4 years ago
- Single source publishing for vertical writing ☆11 · Mar 15, 2021 · Updated 5 years ago
- Reaction Analysis through Imaging of Chemical Units ☆16 · Dec 5, 2025 · Updated 4 months ago
- Page-wise text recognition with lower-supervision line data models ☆52 · Mar 30, 2026 · Updated 2 weeks ago
- Daily SARS-CoV2 Re values for select countries. See https://ibz-shiny.ethz.ch/covid-19-re/ for interactive visualisations. ☆13 · Mar 23, 2023 · Updated 3 years ago
- Thai Law Dataset (Act of Parliament) ☆23 · Jul 21, 2021 · Updated 4 years ago
- Open Source Implementation of the Unique Ring Families Algorithm (Cheminformatics) ☆18 · Aug 5, 2024 · Updated last year
- ☆11 · Dec 23, 2020 · Updated 5 years ago
- Kong OAuth SSO Integration ☆16 · Aug 23, 2017 · Updated 8 years ago
- presentations for busy messy hackers ☆36 · Jan 21, 2014 · Updated 12 years ago
- Voice Model Creator - using a context-specific grammar ☆13 · May 7, 2018 · Updated 7 years ago
- Orchestrate web crawlers to create structured datasets from multiple data sources with YAML configs. ☆16 · Dec 8, 2022 · Updated 3 years ago
- A Python library to add reconstructed pronunciations of Middle Chinese on Chinese texts ☆11 · Mar 13, 2023 · Updated 3 years ago
- tesseractXplore a tesseract ease of use gui with full control ☆28 · Nov 10, 2021 · Updated 4 years ago
- Tesseract tessdata downloader from GitHub repositories ☆11 · Sep 17, 2021 · Updated 4 years ago
- Next generation OCR engine based on LSTMs. ☆51 · Apr 8, 2018 · Updated 8 years ago
- Automation of NOAA satellite reception ☆14 · May 14, 2025 · Updated 11 months ago
- Tutorial on how to create metrics dashboards like the THOR Dashboard ☆14 · Mar 8, 2017 · Updated 9 years ago
- Reads a Bibliography-file (.json) with academic references and renders it as footnotes at the end of the page. Allows for a variety of st… ☆12 · Jul 3, 2023 · Updated 2 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 · Mar 31, 2025 · Updated last year