ocropus / ocropus4-eval
Tools for evaluating OCR performance relative to ground truth.
☆10 Updated last year
Alternatives and similar repositories for ocropus4-eval:
Users who are interested in ocropus4-eval are comparing it to the libraries listed below.
- Ergonomic line-by-line transcription of scanned text. ☆51 Updated 4 years ago
- DFKI Layout Detection for OCR-D ☆47 Updated this week
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 Updated 3 weeks ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 Updated 11 months ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆38 Updated last year
- This library builds a graph-representation of the content of PDFs. The graph is then clustered, resulting page segments are classified an… ☆22 Updated 4 years ago
- OCR-D post-correction module based on weighted finite-state transducers ☆11 Updated last year
- ☆67 Updated last year
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆53 Updated last year
- An OCR evaluation tool ☆65 Updated last month
- Open Access PDF harvester ☆39 Updated 10 months ago
- Recognize text using Calamari OCR and the OCR-D framework ☆14 Updated 5 months ago
- Keyword spaCy is a spaCy pipeline component for extracting keywords from text using cosine similarity. ☆11 Updated last year
- ☆10 Updated 3 years ago
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 Updated 3 years ago
- Glyph Miner, a system for extracting glyphs from early typeset prints ☆34 Updated 8 years ago
- Post-processing OCR errors with seq2seq models ☆28 Updated 4 years ago
- ☆12 Updated 11 months ago
- tesseractXplore a tesseract ease of use gui with full control ☆22 Updated 3 years ago
- visualization using Wikidata data ☆7 Updated 8 months ago
- ☆27 Updated last year
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆15 Updated 7 months ago
- OCRopus model for Gothic print (Fraktur) ☆18 Updated 5 years ago
- Conversions between various OCR formats ☆74 Updated last year
- A browser extension providing Open Access bibliographical services ☆17 Updated 2 years ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆49 Updated last week
- OCR & Ground Truth Resources ☆74 Updated 2 years ago
- Layout Analysis Dataset with Segmonto (LADaS) ☆20 Updated last month
- Segmenting a given document using recursive xy-cut algorithm. ☆12 Updated 6 years ago
- User contributed (non Google) OCR models for Tesseract ☆24 Updated 5 months ago