soodoku / image-to-text
Images of Text to Text: Call Tesseract from Python and OCR a directory of PDFs
☆15 Updated 5 years ago
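As a rough illustration of what "calling Tesseract from Python" over a directory can look like, here is a minimal sketch. It is not taken from the repository: the function names (`build_tesseract_command`, `find_images`, `ocr_directory`) and the list of image suffixes are assumptions made for this example. It assumes the `tesseract` command-line tool is installed and on the PATH, and that PDFs have already been rasterized to page images by some other tool (for instance `pdftoppm`), since that step is not shown.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed set of image types to pick up; adjust to whatever your pages are saved as.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def build_tesseract_command(image_path, lang="eng"):
    """Return the argument list that OCRs one image and prints the text to stdout."""
    # Passing "stdout" as the output base makes the tesseract CLI write the
    # recognized text to standard output instead of to a .txt file.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def find_images(directory):
    """Return the image files directly inside `directory`, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def ocr_directory(directory, lang="eng"):
    """OCR every image in `directory`; return a dict of file name -> extracted text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    results = {}
    for image in find_images(directory):
        completed = subprocess.run(
            build_tesseract_command(image, lang),
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError if tesseract fails on a file
        )
        results[image.name] = completed.stdout
    return results
```

Calling `ocr_directory("scans")` on a hypothetical `scans` folder would return the recognized text keyed by file name. Shelling out to the CLI keeps the sketch free of third-party Python dependencies; a wrapper library could be used instead.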
Alternatives and similar repositories for image-to-text:
Users who are interested in image-to-text are comparing it to the libraries listed below:
- (Python) Execute tesseract OCR on a multi-page PDF. ☆18 Updated last year
- An online reference for data journalism ☆25 Updated 10 years ago
- Search the internet from your terminal. Speed read your results. Terminal nirvana. ☆20 Updated 4 years ago
- Tools for tracking stories on news homepages ☆48 Updated 5 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆23 Updated 8 years ago
- Global Data Journalists Directory ☆10 Updated 6 years ago
- Examples of bad data, especially from government. ☆22 Updated 5 months ago
- Scraper built with Scrapy. ☆14 Updated 5 months ago
- A scraper focused on organizational Github accounts and their members. ☆40 Updated 2 years ago
- A glossary for the United States. ☆42 Updated 9 years ago
- Extract images from PDF documents. Works on multiple and single PDF files ☆14 Updated 7 years ago
- South Africa's by-laws in XML format ☆18 Updated 6 years ago
- A guide to using The State Decoded, whether deploying a site or using the data from one. ☆13 Updated 7 years ago
- Monitor datasets, get alerts when something happens ☆210 Updated 6 years ago
- c-span opened captions node buffer server + google docs apps script ☆8 Updated 5 years ago
- Trough: Big data, small databases. ☆40 Updated 5 months ago
- Extract list of results from search engines pages as CSV with a bookmarklet directly within the browser ☆19 Updated last week
- An index of formal complaint systems ☆17 Updated 6 years ago
- stoplists for African languages generated from the ASP corpus ☆14 Updated 9 years ago
- Legislative data from the congress repository ☆19 Updated 11 years ago
- List of easy American-English words: The New Dale-Chall (1995) ☆32 Updated 2 years ago
- A home to keep my web monkeys... ☆39 Updated 4 years ago
- Command line tool to extract borders of GeoJSON Polygons into a non-overlapping set of LineStrings ☆12 Updated 5 years ago
- A tile grid map component built on top of d3Kit ☆11 Updated 2 years ago
- Ask questions about government data. ☆37 Updated 6 years ago
- code to remove "noise" from hOCR output of Tesseract OCR. ☆14 Updated 8 years ago
- computational sociology of philosophy ☆13 Updated 8 years ago
- ☆36 Updated last year
- Fiscal datapackage visualisations and dashboard ☆10 Updated 2 years ago
- ☆18 Updated 9 years ago