qedsoftware / multipage-ocr
(Python) Execute tesseract OCR on a multi-page PDF.
☆18Updated 2 years ago
Alternatives and similar repositories for multipage-ocr
Users that are interested in multipage-ocr are comparing it to the libraries listed below
- Use visual programming to build data tables based on text data within the Orange data mining software environment☆29Updated 2 months ago
- A toolkit for mapping networks of political and economic influence through diverse types of entities and their relations. Accessible at h…☆189Updated 4 years ago
- 🍊 Prototype Orange widgets — only for the brave.☆12Updated 8 months ago
- Convert a corpus of PDF to clean text files on a distributed architecture☆39Updated last year
- Monitor datasets, get alerts when something happens☆210Updated 6 years ago
- A collection of ipython/jupyter notebooks☆16Updated 6 years ago
- Binary Python bindings for poppler utils for content extraction☆42Updated 4 years ago
- Tribe extracts a network from an email mbox and writes it to a graphml file for visualization and analysis.☆79Updated 2 years ago
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali…☆84Updated 5 years ago
- Ergonomic line-by-line transcription of scanned text.☆53Updated 4 years ago
- Orange Data Mining Homepage☆16Updated 5 years ago
- Easily display Zotero items on a webpage☆32Updated 2 years ago
- Images of Text to Text: Call Tesseract from Python and OCR a directory of pdfs☆15Updated 5 years ago
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a…☆99Updated 2 years ago
- Information extraction and interactive visualization of textual datasets for investigative data-driven journalism and eDiscovery☆56Updated last year
- Now included in rigour☆151Updated last week
- Tools for tracking stories on news homepages☆48Updated 5 years ago
- Installer for Thymeflow, a personal knowledge management system.☆33Updated 7 years ago
- Named-Entity Recognition extension for Google Refine / OpenRefine☆73Updated 8 years ago
- Palladio Application☆42Updated 3 years ago
- Atom/Electron Application for calling PanDoc Converter with Shell Commands on Linux, Windows, Mac☆17Updated 4 years ago
- Take streaming tweets, extract hashtags & usernames, create graph, export graphml for Gephi visualisation☆38Updated 12 years ago
- Copyleaks finds plagiarism online using copyright infringement detection technology. Find those who have used your content with Copyleaks…☆89Updated this week
- 🍊 Text Mining add-on for Orange3☆132Updated last month
- A Twitter data collection and appraisal application.☆51Updated 2 years ago
- API client for Aleph, supports bulk entity and document upload.☆28Updated 9 months ago
- Simple taxonomy management tool and document classifier.☆56Updated 5 years ago
- A simple platform for managing structured data.☆27Updated 3 years ago
- A PDF classifier ensemble with REST API service☆23Updated 4 years ago
- A place to collect and share knowledge about liberating data from PDFs☆54Updated 3 years ago