qedsoftware / multipage-ocr
(Python) Execute tesseract OCR on a multi-page PDF.
☆19 Updated 2 years ago
Alternatives and similar repositories for multipage-ocr
Users who are interested in multipage-ocr are comparing it to the libraries listed below.
- Binary Python bindings for poppler utils for content extraction ☆42 Updated 4 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆38 Updated last year
- Backend for social-media-picture-explorer-ui, a tool for using deep learning to interactively explore social media ☆53 Updated 7 years ago
- 🍊 Prototype Orange widgets — only for the brave. ☆12 Updated 3 weeks ago
- Tools for analyzing the Hillary Clinton emails ☆13 Updated 9 years ago
- (BROKEN, help wanted) ☆15 Updated 9 years ago
- A toolkit for clustering web pages based on various similarity measures. ☆34 Updated 3 years ago
- Installer for Thymeflow, a personal knowledge management system. ☆34 Updated 7 years ago
- Use visual programming to build data tables based on text data within the Orange data mining software environment ☆29 Updated 3 months ago
- Images of Text to Text: Call Tesseract from Python and OCR a directory of pdfs ☆15 Updated 5 years ago
- Source code for the Twitter Hybrid Sentiment Classifier used in Semeval 2014 competition. (Sentiment Analysis system) ☆13 Updated 11 years ago
- Tribe extracts a network from an email mbox and writes it to a graphml file for visualization and analysis. ☆79 Updated 2 years ago
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali… ☆86 Updated 5 years ago
- LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilatio… ☆68 Updated 2 years ago
- A toolkit for mapping networks of political and economic influence through diverse types of entities and their relations. Accessible at h… ☆189 Updated 4 years ago
- A platform for collecting, analyzing, and visualizing social media data. ☆12 Updated 4 years ago
- Monitor datasets, gets alerts when something happens ☆210 Updated 6 years ago
- Convert text from PDF to XML. ☆45 Updated 6 years ago
- Orange Data Mining Homepage ☆17 Updated 5 years ago
- General Architecture for Text Engineering ☆49 Updated 9 years ago
- Demo of the Newspaper article extraction library. ☆29 Updated 10 years ago
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess… ☆16 Updated 10 years ago
- A pipeline for detecting novel information about entities from a stream of text, updating a knowledge base about the entities, and genera… ☆32 Updated 6 years ago
- ☆49 Updated 11 years ago
- see also section scraping on custom levels of depth ☆87 Updated 7 months ago
- Tools for tracking stories on news homepages ☆48 Updated 5 years ago
- Date parsing and normalization utilities for Python. ☆22 Updated last year
- Analyze the nouns and entities in a rss feed ☆21 Updated 4 years ago
- A place to collect and share knowledge about liberating data from PDFs ☆55 Updated 3 years ago
- Ideas for (tech) stuff to research, build or work on. ☆50 Updated 8 months ago