ocropus / ocropus4-eval
Tools for evaluating OCR performance relative to ground truth.
☆10 Updated last year
Alternatives and similar repositories for ocropus4-eval
Users who are interested in ocropus4-eval compare it to the repositories listed below.
- Post-processing OCR errors with seq2seq models ☆28 Updated 4 years ago
- DFKI Layout Detection for OCR-D ☆47 Updated last month
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 Updated 2 months ago
- ☆67 Updated last year
- Open Access PDF harvester ☆40 Updated last year
- A search engine built on the Unpaywall database ☆21 Updated last year
- OCR-D post-correction module based on weighted finite-state transducers ☆11 Updated last year
- ☆25 Updated last year
- Flow Chart Image-to-Code Generation ☆33 Updated last year
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 Updated last month
- Layout Analysis Dataset with Segmonto (LADaS) ☆21 Updated last month
- Ergonomic line-by-line transcription of scanned text. ☆52 Updated 4 years ago
- Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python ☆18 Updated 2 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆39 Updated last year
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents ☆25 Updated 2 years ago
- Small python package to measure OCR quality and other related metrics. ☆23 Updated last year
- A browser extension providing Open Access bibliographical services ☆17 Updated 2 years ago
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆70 Updated last week
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. ☆13 Updated last year
- Translate files using Argos Translate ☆20 Updated 2 months ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆53 Updated 2 years ago
- OCR & Ground Truth Resources ☆76 Updated 3 years ago
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆18 Updated 10 months ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆51 Updated 3 months ago
- Homebase of the IPTC EXTRA project about rule-based text categorization ☆13 Updated 8 years ago
- Logical structure analysis for visually structured documents ☆90 Updated 2 years ago
- Scrollership through 20m pubmed abstracts. ☆26 Updated 2 years ago
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆50 Updated 9 months ago
- Next-generation Punkt sentence boundary detection with zero dependencies ☆17 Updated 2 months ago
- Corpus Build OCR platform ☆8 Updated 2 years ago