OCR-D/ocrd_tesserocr

Run tesseract with the tesserocr bindings with @OCR-D's interfaces

☆39

Alternatives and similar repositories for ocrd_tesserocr

Users who are interested in ocrd_tesserocr are comparing it to the libraries listed below.

UB-Mannheim / GTCheck
Check your modified Ground Truth files with visual support!
☆10 · Jan 31, 2024 · Updated 2 years ago
OCR-D / ocrd_pagetopdf
OCR-D wrapper for prima-pagetopdf
☆10 · Oct 30, 2025 · Updated 8 months ago
cisocrgroup / ocrd_cis
OCR-D python tools
☆33 · Aug 16, 2024 · Updated last year
OCR-D / ocrd_all
Master repository which includes most other OCR-D repositories as submodules
☆73 · Jul 4, 2025 · Updated last year
cisocrgroup / Resources
Manuals, lexica, OCR test data for PoCoTo and the profiler
☆15 · Jul 2, 2021 · Updated 5 years ago
OCR-D / core
Collection of OCR-related python tools and wrappers from @OCR-D
☆135 · Updated this week
OCR-D / ocrd-website
☆24 · Jun 9, 2026 · Updated last month
Doreenruirui / okralact
A repository for online OCRD training infrastructure.
☆13 · Aug 20, 2020 · Updated 5 years ago
UB-Mannheim / Fibeln
Transkriptionen von Fibeln (19. Jahrhundert)
☆11 · Oct 31, 2025 · Updated 8 months ago
PRImA-Research-Lab / prima-page-to-pdf
Java command line tool to convert PAGE XML files with layout and text content to PDF
☆10 · Apr 27, 2020 · Updated 6 years ago
PRImA-Research-Lab / prima-core-libs
Core libraries by the PRImA Research Lab
☆16 · Jul 30, 2024 · Updated last year
jbaiter / archiscribe
Web application for transcribing OCR ground truth from Archive.org
☆18 · Feb 22, 2018 · Updated 8 years ago
PRImA-Research-Lab / prima-page-converter
Command line tool to convert page layout files to the latest PAGE XML format. It supports all previous versions of the PAGE format as wel…
☆25 · Jan 30, 2021 · Updated 5 years ago
ulb-sachsen-anhalt / ulb-zeitungsprojekt-hp1
Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen"
☆12 · Dec 17, 2021 · Updated 4 years ago
OCR4all / LAREX
A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.
☆198 · Updated this week
tesseract-ocr / tessdata_contrib
User contributed (non Google) OCR models for Tesseract
☆33 · Jun 12, 2026 · Updated last month
ASVLeipzig / cor-asv-ann
OCR-D post-correction with encoder-attention-decoder LSTMs
☆13 · May 1, 2025 · Updated last year
altomator / EN-data_mining
Data Mining Historical Newspaper Metadata (METS/ALTO formats)
☆25 · Feb 6, 2026 · Updated 5 months ago
OCR-D / page-to-alto
Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2)
☆17 · Jun 5, 2026 · Updated last month
PRImA-Research-Lab / PAGE-XML
PAGE XML format collection for document image page content and more
☆71 · Jan 16, 2026 · Updated 6 months ago
kennethleungty / OCR-Metrics-CER-WER
Sample implementation of OCR metrics (CER, WER) calculation with TesseractOCR and fastwer
☆30 · Jun 25, 2021 · Updated 5 years ago
bibliocoll / JournalTouch
JournalTouch provides a touch-optimized interface for browsing current journal tables of contents in Responsive Design. Fun!
☆14 · May 27, 2019 · Updated 7 years ago
stefanklut / laypa
Layout analysis to find layout elements in documents (similar to P2PaLA)
☆22 · May 20, 2026 · Updated 2 months ago
OCR-D / ocrd_kraken
Wrapper for the kraken OCR engine
☆12 · Jul 12, 2025 · Updated last year
VRI-UFPR / ocrd-gbn
OCR-D compliant toolset for optical layout recognition on historical German-language documents published in Brazil
☆11 · Sep 24, 2021 · Updated 4 years ago
OCR-D / spec
Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s)
☆17 · Sep 18, 2025 · Updated 10 months ago
OCR-D / ocrd_anybaseocr
DFKI Layout Detection for OCR-D
☆47 · May 1, 2025 · Updated last year
qurator-spk / eynollah
Document Layout Analysis
☆408 · Updated this week
ASVLeipzig / cor-asv-fst
OCR-D post-correction module based on weighted finite-state transducers
☆11 · Jan 13, 2024 · Updated 2 years ago
andbue / nashi
Some bits of javascript to transcribe scanned pages using PageXML
☆17 · May 27, 2026 · Updated last month
wikimedia / wikimedia-ocr
This repository is now at https://gitlab.wikimedia.org/toolforge-repos/ocr
☆17 · May 19, 2026 · Updated 2 months ago
dot-legal / reference
Write beautifully short contracts. https://reference.legal/ is a referenceable clause library to standardize contracts once and for all.
☆13 · Jul 12, 2022 · Updated 4 years ago
jze / ocropus-model_fraktur
OCRopus model for Gothic print (Fraktur)
☆19 · Feb 16, 2020 · Updated 6 years ago
qurator-spk / sbb_textline_detection
Detect textlines in document images
☆90 · May 27, 2024 · Updated 2 years ago
wincentbalin / pytesstrain
Python tools for Tesseract OCR training
☆26 · May 2, 2022 · Updated 4 years ago
mscarey / legislice
API client for fetching and comparing passages from legislation
☆14 · Jun 29, 2026 · Updated 3 weeks ago
filak / hOCR-to-ALTO
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
☆60 · Mar 20, 2026 · Updated 4 months ago
martinreynaert / TICCL
Text-Induced Corpus Clean-up
☆20 · Jun 20, 2023 · Updated 3 years ago
JSv4 / AtticusClassifier
Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus
☆14 · Jan 2, 2021 · Updated 5 years ago