soodoku / image-to-text
Images of Text to Text: Call Tesseract from Python and OCR a directory of PDFs
☆15 · Updated 5 years ago
Related projects
Alternatives and complementary repositories for image-to-text
- (Python) Execute Tesseract OCR on a multi-page PDF. ☆18 · Updated last year
- Code to remove "noise" from hOCR output of Tesseract OCR. ☆14 · Updated 8 years ago
- Legislative data from the congress repository ☆19 · Updated 11 years ago
- Capstone GRS Website ☆7 · Updated 5 years ago
- An online reference for data journalism ☆25 · Updated 10 years ago
- Stoplists for African languages generated from the ASP corpus ☆14 · Updated 8 years ago
- RESTful API around the PETRARCH coding software ☆10 · Updated 3 years ago
- Utility to fetch provenance information from Internet Archive's Wayback Machine ☆13 · Updated 2 years ago
- A guide to using The State Decoded, whether deploying a site or using the data from one. ☆13 · Updated 7 years ago
- Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas… ☆12 · Updated 10 years ago
- C-SPAN opened captions Node buffer server + Google Docs Apps Script ☆8 · Updated 5 years ago
- A glossary for the United States. ☆42 · Updated 9 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆23 · Updated 8 years ago
- Query Wikipedia articles ☆18 · Updated 2 years ago
- A collection of all the court seals we can muster. ☆19 · Updated last month
- Scraper built with Scrapy. ☆14 · Updated 3 months ago
- Examples of bad data, especially from government. ☆22 · Updated 3 months ago
- Colors in Library of Congress digital images. ☆32 · Updated 6 years ago
- Open source image editor for Windows 10. Can be controlled by voice commands and Cortana. ☆15 · Updated 7 years ago
- A collection of data about how federal agencies divide their agency coverage geospatially ☆11 · Updated 8 years ago
- A scraper focused on organizational GitHub accounts and their members. ☆40 · Updated 2 years ago
- A Data Parsing/Data Manipulation Tool Supporting Digitization Projects and Other Data Analysis Projects ☆47 · Updated 5 years ago
- Responsively embed DocumentCloud notes. ☆21 · Updated 6 years ago
- Charts for the Consumer Financial Protection Bureau ☆12 · Updated 7 months ago
- Search the internet from your terminal. Speed read your results. Terminal nirvana. ☆20 · Updated 3 years ago
- Trading Consequences data and code ☆15 · Updated 9 years ago