CopenhagenCityArchives / CorrectOCR
Machine Learning-assisted correction of OCR errors in historical corpora
☆9 Updated 2 months ago
Alternatives and similar repositories for CorrectOCR:
Users who are interested in CorrectOCR are comparing it to the libraries listed below.
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 Updated 4 months ago
- Master repository which includes most other OCR-D repositories as submodules ☆72 Updated 3 months ago
- ☆15 Updated 3 years ago
- In-browser OCR of Ancient Greek and Latin ☆25 Updated 2 months ago
- PAGE XML format collection for document image page content and more ☆67 Updated 3 years ago
- ☆20 Updated 5 years ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆52 Updated last year
- An OCR evaluation tool ☆64 Updated last month
- Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s) ☆17 Updated 4 months ago
- OCR-D-compliant page segmentation ☆67 Updated 4 months ago
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models" ☆36 Updated last year
- Layout Analysis Dataset with Segmonto (LADaS) ☆19 Updated last month
- ☆10 Updated 5 years ago
- Document Image Binarization ☆75 Updated 3 months ago
- Given a text, wrap it into phrases and send them to Yandex's search engine. If it yields a "did you mean:", substitute the original phras… ☆11 Updated 6 years ago
- Wrapper around pixel classifier ☆9 Updated 2 years ago
- tesseractXplore: a Tesseract ease-of-use GUI with full control ☆21 Updated 3 years ago
- Ergonomic line-by-line transcription of scanned text. ☆50 Updated 4 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 Updated 9 months ago
- Neural Elastic Inference and Search ☆19 Updated 5 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆37 Updated 3 years ago
- Discourse Analysis Tool Suite ☆18 Updated this week
- Repository hosting the common code for the entity-fishing clients ☆9 Updated 7 months ago
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 Updated 3 years ago
- Python 3 library for processing historical English ☆64 Updated 5 months ago
- A suite of batches and tools for OCR tasks. ☆71 Updated last year
- ☆11 Updated 3 years ago
- Code examples for Google Natural Language API. ☆13 Updated 5 years ago
- Using Conditional Random Fields for segmenting Latin words written in scriptio continua ☆10 Updated 6 years ago
- Named entity recognition for the legal domain ☆41 Updated 3 years ago