datamade / pdf-textextract
Docker container for Make-based PDF text extraction using OCR
☆13 Updated last year
Alternatives and similar repositories for pdf-textextract
Users who are interested in pdf-textextract are comparing it to the repositories listed below.
- yet another foia automation service ☆44 Updated 3 years ago
- semantic search for text in your spreadsheets ☆56 Updated 3 weeks ago
- JSON to geocode list of addresses in OpenRefine, using HERE and OpenStreetMap Nominatim APIs ☆30 Updated 10 months ago
- Docs and info from my 2018 workshop at the CAR conference ☆29 Updated 7 years ago
- Nicar ML/NLP workshop by J Kao ☆19 Updated 6 years ago
- A demo project and template repository showing how I use SpatiaLite with Datasette for quick spatial analysis. ☆16 Updated last year
- a python parser for the .fec file format ☆46 Updated 6 months ago
- POLITICO's system for managing civic data ☆20 Updated 2 years ago
- A new version of the cook county jail scraper, inspired by the Supreme Chi-Town Coding Crew ☆23 Updated 2 years ago
- A collection of cheat sheets for remembering common commands and tips for data journalism work. ☆38 Updated 2 years ago
- An easy-to-use point-and-click geocoder 🌍📍 ☆15 Updated 2 years ago
- A Python wrapper for the OpenFEC API. ☆28 Updated 6 years ago
- Using Fly.io to generate map tiles ☆19 Updated 2 years ago
- Scrapes municipal data from Legistar websites ☆48 Updated last month
- ☆10 Updated 6 years ago
- Collaborative data collection tool developed by the Associated Press ☆109 Updated 2 years ago
- A simple app to add OAuth-based authentication in front of an S3 bucket-based static website. ☆11 Updated 2 years ago
- Project generator for use with the datakit framework. ☆28 Updated last year
- GIS data for the U.S.-Mexico border fence (perhaps a wall in the future) ☆28 Updated 8 years ago
- Combine U.S. census data responsibly ☆46 Updated 2 years ago
- Voting Precinct Shapefiles in the United States ☆100 Updated 7 years ago
- NYC 311 complaints and demographic analysis ☆42 Updated 7 years ago
- A tutorial on optical character recognition using tesseract, ImageMagick and other open source tools ☆69 Updated 9 months ago
- Obtained in December 2014 through a Freedom of Information request ☆15 Updated 9 years ago
- transform a datapoint from a website into a CSV time-series dataset using the wayback machine ☆12 Updated 2 years ago
- Map locator image generator ☆22 Updated 9 years ago
- Teaching guide for a one-hour hands-on session at an IRE/NICAR conference on scraping web data using Python. ☆27 Updated last year
- Official repo documenting the closure of Sunlight Labs ☆11 Updated 9 years ago
- A build tool for data projects. ☆49 Updated 10 months ago
- ☆12 Updated 6 years ago