tberg12 / ocular
Ocular is a state-of-the-art historical OCR system.
☆250 Updated 3 months ago
Related projects:
- A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books. ☆179 Updated last month
- OCR evaluation brought to you by University of Alicante ☆66 Updated 2 years ago
- An expandable and scalable OCR pipeline ☆86 Updated 6 years ago
- Repository collecting all the submodules for the new PyTorch-based OCR System. ☆142 Updated 3 years ago
- Toolbox for OCR post-correction ☆122 Updated 5 years ago
- Update of the ISRI Analytic Tools for OCR Evaluation with UTF-8 support ☆55 Updated 3 years ago
- Named Entities Recognition Annotator Tool for Europeana Newspapers ☆60 Updated 6 years ago
- PAGE XML format collection for document image page content and more ☆62 Updated 3 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆176 Updated last month
- Training files produced for and by the Tesseract OCR engine for work on the Early Modern OCR Project (eMOP) ☆36 Updated 8 years ago
- The CIS OCR PostCorrectionTool ☆39 Updated last year
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆363 Updated last month
- A suite of batches and tools for OCR tasks. ☆71 Updated last year
- Web-based JavaScript GUI library for proofreading/editing hOCR ☆89 Updated 6 years ago
- Detect and align similar passages ☆86 Updated 2 weeks ago
- Java-based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and hOCR. ☆34 Updated last year
- ALTO XML schema - latest and all former versions ☆51 Updated 2 months ago
- Generic framework for historical document processing ☆369 Updated 3 years ago
- ☆27 Updated 11 months ago
- Collection of OCR-related python tools and wrappers from @OCR-D ☆118 Updated this week
- Provides OCR (Optical Character Recognition) services through web applications ☆231 Updated 7 months ago
- A deep learning toolkit specialized for handwritten document analysis ☆202 Updated 2 weeks ago
- Semantic Annotation Without the Pointy Brackets ☆151 Updated 7 months ago
- Master repository which includes most other OCR-D repositories as submodules ☆71 Updated last month
- Mechanical Turk on your own machine. ☆206 Updated 2 years ago
- Working with hOCR in JavaScript ☆119 Updated last year
- Command line tool to extract figures, tables, and captions from scholarly documents in PDF form. ☆130 Updated 6 years ago
- Simple app for visual editing of Page XML files ☆30 Updated 8 months ago
- High-level build project for all LAPDF-Text submodules ☆103 Updated 9 years ago
- Page to PAGE Layout Analysis Tool ☆190 Updated 2 years ago