qedsoftware / multipage-ocr
(Python) Execute tesseract OCR on a multi-page PDF.
★18 · Updated 2 years ago
Alternatives and similar repositories for multipage-ocr
Users that are interested in multipage-ocr are comparing it to the libraries listed below
- Prototype Orange widgets, only for the brave. ★12 · Updated 7 months ago
- Images of Text to Text: Call Tesseract from Python and OCR a directory of pdfs ★15 · Updated 5 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ★39 · Updated last year
- Python wrapper for xpdf ★19 · Updated 5 years ago
- Tribe extracts a network from an email mbox and writes it to a graphml file for visualization and analysis. ★79 · Updated 2 years ago
- Information extraction and interactive visualization of textual datasets for investigative data-driven journalism and eDiscovery ★56 · Updated last year
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a… ★99 · Updated 2 years ago
- Using ML to extract campaign finance data from messy forms for journalism ★76 · Updated 3 years ago
- Convert text from PDF to XML. ★45 · Updated 6 years ago
- Tools for analyzing the Hillary Clinton emails ★13 · Updated 9 years ago
- Ergonomic line-by-line transcription of scanned text. ★53 · Updated 4 years ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders… ★45 · Updated 3 years ago
- A set of workflows for corpus building through OCR, post-correction and normalisation ★49 · Updated 2 years ago
- GUI for training spaCy models ★55 · Updated 4 years ago
- Next generation OCR engine based on LSTMs. ★52 · Updated 7 years ago
- PDF analysis. Convert contents of PDF to a JSON-style python dictionary. ★31 · Updated 2 years ago
- Data notification service: subscribe to keywords and get notified whenever an open data source mentions that keyword. ★24 · Updated 11 years ago
- Trying to generate name synonyms from wikidata ★32 · Updated 5 years ago
- Use visual programming to build data tables based on text data within the Orange data mining software environment ★29 · Updated last month
- Detecting Mines in the Democratic Republic of Congo via Satellite Imagery ★12 · Updated 2 years ago
- PST extraction and analytic pipeline ★37 · Updated 7 years ago
- Installer for Thymeflow, a personal knowledge management system. ★33 · Updated 7 years ago
- ★72 · Updated 6 months ago
- Keyword Extraction system using Brown Clustering - (This version is trained to extract keywords from job listings) ★18 · Updated 10 years ago
- The OpenSextant Gazetteer is a collection of world-wide place name data ★12 · Updated 7 years ago
- Educational widgets for machine learning and data mining in Orange 3. ★28 · Updated last year
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali… ★84 · Updated 5 years ago
- Quickly analyze and explore email with advanced analytics and visualization. ★56 · Updated 3 years ago
- Take streaming tweets, extract hashtags & usernames, create graph, export graphml for Gephi visualisation ★38 · Updated 12 years ago
- Text Mining add-on for Orange3 ★132 · Updated last month