artunit / ossocr
gathering point for open source OCR scripts and diffs
☆43Updated 10 years ago
Related projects:
- A small Docker built for the OCRopus OCR system.☆19Updated 6 years ago
- Experiments mining image collections using OpenCV☆64Updated 9 years ago
- Polytonic Greek OCR engine derived from Gamera and based on the work of Dalitz and Brandt☆32Updated 9 years ago
- Named-Entity Recognition extension for Google Refine / OpenRefine☆73Updated 7 years ago
- Named Entities Recognition Annotator Tool for Europeana Newspapers☆60Updated 6 years ago
- code to remove "noise" from hOCR output of Tesseract OCR.☆14Updated 7 years ago
- Part of eMOP: Franken+ tool for creating font training for Tesseract OCR engine from page images.☆24Updated 8 years ago
- Solrstrap is a Query-Result interface for Solr written in JavaScript, HTML and CSS☆86Updated 7 years ago
- Docker container to provide Apache Tika RESTful API☆40Updated 8 years ago
- View HOCR files with Mirador☆26Updated 6 years ago
- Training files produced for and by the Tesseract OCR engine for work on the Early Modern OCR Project (eMOP)☆36Updated 8 years ago
- Keeps a mirror of DBpedia live in sync☆26Updated 3 years ago
- A backend store for the Annotator☆176Updated 8 years ago
- Efficient indexing and retrieval of OCR bounding boxes in Solr☆22Updated 5 years ago
- [DEPRECATED] Scribe is a generalised transcription tool by the Zooniverse.☆49Updated 12 years ago
- KEA 5.0 (keyphrase extraction software), modified to be an XML-RPC service☆42Updated 13 years ago
- SKOS analysis for Elasticsearch☆54Updated 8 years ago
- Ergonomic line-by-line transcription of scanned text.☆47Updated 3 years ago
- A trend viewer written in Python/JavaScript☆20Updated 2 years ago
- A selection of test lines of several early printed books as well as the corresponding individual OCRopus models and mixed models.☆10Updated 6 years ago
- Code libraries for working with text content in ancient Greek☆9Updated 7 years ago
- Presentations, tutorials and data for the OCR workshop at LMU☆17Updated 7 years ago
- QA-tool for scans with corresponding ALTO-files☆22Updated last year
- Adds the ability to transcribe items using the Scripto library.☆17Updated last month
- JS for overlaying OCR on image using HOCR formatted HTML☆23Updated 8 years ago
- System for building, visualizing, and working with LDA topic models☆92Updated 2 months ago
- Semiautomatic annotation editor for rich html editors.☆60Updated 11 years ago