gojiplus / image-to-text
Images of Text to Text: Call Tesseract from Python and OCR a directory of PDFs
☆15 Updated 5 years ago
Alternatives and similar repositories for image-to-text
Users who are interested in image-to-text are comparing it to the libraries listed below.
- In this project, there are two major tasks: text data processing and text categorization. In text data processing, we have done tokenizat… ☆8 Updated 8 years ago
- Quill Grammar App ☆11 Updated 7 years ago
- A place to collect and share knowledge about liberating data from PDFs ☆54 Updated 3 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆24 Updated 8 years ago
- Focused Crawler for VT's CTRNet ☆10 Updated 12 years ago
- Monitor datasets, get alerts when something happens ☆210 Updated 6 years ago
- Classifying the content of domains ☆56 Updated 2 years ago
- Labeled segmentation for the document structure of printed books ☆15 Updated 8 years ago
- Utilities for retrieving whitehouse.gov transcripts and matching news quotes to them ☆16 Updated 10 years ago
- Ruby script to download bulk results from Archive.org's TV News database of closed captions ☆14 Updated 12 years ago
- Workshop bringing together individuals interested in developing curriculum, workflows, and tools to strengthen reproducibility in researc… ☆33 Updated 10 years ago
- Tools for tracking stories on news homepages ☆48 Updated 5 years ago
- PETRARCH actor, agent and verb dictionaries ☆22 Updated 7 years ago
- GenderTracker is a service that decomposes articles and computes various gender-related metrics based on the content. ☆25 Updated 11 years ago
- Python natural language processing work ☆29 Updated 15 years ago
- Code supporting the dissertation "Agents in Conflict," George Mason University, 2016 ☆21 Updated 9 years ago
- Amsterdam Content Analysis Toolkit ☆46 Updated 3 years ago
- Language checker and hyphenator extension for LibreOffice ☆12 Updated 5 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆38 Updated last year
- Investigative tool for extracting relevant areas from many documents ☆14 Updated 9 years ago
- Parser for KAF NAF files written in Python ☆16 Updated 4 years ago
- (Python) Execute tesseract OCR on a multi-page PDF. ☆18 Updated 2 years ago
- Various functions to make bag-of-words approaches to text analysis more user-friendly ☆24 Updated 8 years ago
- The Data Journalism Handbook was born at a 48 hour workshop at MozFest 2011 in London. It subsequently spilled over into an international… ☆32 Updated 13 years ago
- ScraperWiki Python library for scraping and saving data ☆159 Updated 2 years ago
- bigram / trigram analysis of wikipedia; mainly mutual info ☆22 Updated 13 years ago
- Code, data, and paper for Academia.edu citation advantage analysis ☆31 Updated 9 years ago
- Command-line tool to extract a ranked list of relevant keywords from a corpus with the option of using either topic modeling or tf-idf sc… ☆40 Updated 8 years ago
- R Shiny App created to predict the success rate of Freedom of Information Act requests. ☆16 Updated 7 years ago
- Pure python script that takes user query and summarizes news related to it. ☆25 Updated 3 years ago