gojiplus / image-to-text
Images of Text to Text: Call Tesseract from Python and OCR a directory of PDFs
☆15 Updated 5 years ago
Alternatives and similar repositories for image-to-text
Users who are interested in image-to-text are comparing it to the libraries listed below
- Monitor datasets, get alerts when something happens ☆210 Updated 6 years ago
- Tools for tracking stories on news homepages ☆48 Updated 5 years ago
- Investigative tool for extracting relevant areas from many documents ☆14 Updated 9 years ago
- Python scraper to get weekly CDC flu surveillance data ☆25 Updated 10 years ago
- A repository of materials for a proposed class on automated story bots. ☆49 Updated 7 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆24 Updated 8 years ago
- Ruby script to download bulk results from Archive.org's TV News database of closed captions ☆14 Updated 12 years ago
- Utilities for retrieving whitehouse.gov transcripts and matching news quotes to them ☆16 Updated 10 years ago
- A place to collect and share knowledge about liberating data from PDFs ☆54 Updated 3 years ago
- Twitter Bots! ☆10 Updated 11 years ago
- Near-duplicate detection tool ☆24 Updated 8 years ago
- Labeled segmentation for the document structure of printed books ☆15 Updated 8 years ago
- Language checker and hyphenator extension for LibreOffice ☆12 Updated 5 years ago
- Focused Crawler for VT's CTRNet ☆10 Updated 12 years ago
- The BITS Lab STACK tool for social media collection and analysis. ☆39 Updated 2 years ago
- Analysis related to article on FOIA Online Database. ☆11 Updated 8 years ago
- Twitter stream and social network crawling tools ☆16 Updated 8 years ago
- Pure Python script that takes a user query and summarizes news related to it. ☆25 Updated 3 years ago
- Newsclipse: The IDE for news production. ☆91 Updated 10 years ago
- A collection of introductions to various datasets, giving journalists some friendly background before they start doing analysis. Like "Hi… ☆71 Updated 10 years ago
- GenderTracker is a service that decomposes articles and computes various gender-related metrics based on the content. ☆25 Updated 11 years ago
- Quill Grammar App ☆11 Updated 7 years ago
- Code for extracting data from a large number of PDFs, particularly FCC political ad documents ☆15 Updated 7 years ago
- Fetch and parse the American Presidency Project's press-briefing and presidential-news-conference transcripts. ☆11 Updated 9 years ago
- Compare coverage across different media sources using the Juicer ☆12 Updated 9 years ago
- Convert a corpus of PDFs to clean text files on a distributed architecture ☆38 Updated last year
- Race and Gender of Criminals and Victims in Law and Order ☆13 Updated 4 years ago
- R Shiny App created to predict the success rate of Freedom of Information Act requests. ☆16 Updated 7 years ago
- This is a side project from 2008. This package contains a tool for automatically cropping and deskewing images of book pages captured by … ☆28 Updated 12 years ago
- Machine assisted dossiers ☆19 Updated 7 years ago