tmbarchive / docker-ocropus
A small Docker build for the OCRopus OCR system.
☆20 · Updated 7 years ago
Alternatives and similar repositories for docker-ocropus
Users who are interested in docker-ocropus are comparing it to the repositories listed below.
- Docker container to provide Apache Tika RESTful API ☆41 · Updated 9 years ago
- Part of eMOP: Franken+ tool for creating font training for Tesseract OCR engine from page images. ☆24 · Updated 9 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆39 · Updated last year
- An expandable and scalable OCR pipeline ☆87 · Updated 7 years ago
- Experiments mining image collections using OpenCV ☆64 · Updated 10 years ago
- Google Refine extension for adding columns (extending data) from DBpedia ☆39 · Updated 11 years ago
- Monitor datasets, get alerts when something happens ☆210 · Updated 6 years ago
- Next generation OCR engine based on LSTMs. ☆52 · Updated 7 years ago
- gathering point for open source OCR scripts and diffs ☆43 · Updated 11 years ago
- Presentations, tutorials and data for the OCR workshop at LMU ☆17 · Updated 8 years ago
- Plots various graphs for a series of plaintext files in a directory ☆19 · Updated 9 years ago
- Tooling to extract data from scanned paper forms OCR-ed by Tesseract using the HOCR standard. ☆84 · Updated 9 years ago
- Visualization Storytelling Components ☆32 · Updated 11 years ago
- Ergonomic line-by-line transcription of scanned text. ☆53 · Updated 4 years ago
- A space for code and projects around analysing news content ☆23 · Updated 7 years ago
- code to remove "noise" from hOCR output of Tesseract OCR. ☆14 · Updated 8 years ago
- This version of Rhizomer is archived, the current version is linked from: ☆14 · Updated 7 years ago
- A workflow system for Natural Language Processing. ☆21 · Updated 5 years ago
- Tribe extracts a network from an email mbox and writes it to a graphml file for visualization and analysis. ☆79 · Updated 2 years ago
- A trend viewer written in Python/JavaScript ☆21 · Updated 8 months ago
- Tools to manipulate and extract data from wikipedia dumps ☆46 · Updated 12 years ago
- Information extraction and interactive visualization of textual datasets for investigative data-driven journalism and eDiscovery ☆56 · Updated last year
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik ☆24 · Updated 9 years ago
- Memory-based shallow parser for Python ☆74 · Updated 5 years ago
- A queue-controlled browser automation tool for improving web crawl quality ☆61 · Updated 4 months ago
- RDFSpace constructs a vector space from any RDF dataset which can be used for computing similarities between resources in that dataset. ☆40 · Updated 11 years ago
- Tools for text tokenization and encoding ☆84 · Updated 3 years ago
- Tools for working with Optical Character Recognition output ☆16 · Updated 11 years ago
- ApertureJS - an open, adaptable and extensible JavaScript visualization framework ☆56 · Updated 9 years ago
- Data Pipes for CSV ☆116 · Updated 2 years ago