datamade / pdf-textextract
Docker container for Make-based PDF text extraction using OCR
☆11 Updated 3 months ago
Related projects
Alternatives and complementary repositories for pdf-textextract
- A collection of cheat sheets for remembering common commands and tips for data journalism work. ☆38 Updated last year
- Data and scripts relating to the publishing of the House expenditure reports, and hopefully the Senate's in future. ☆24 Updated 3 years ago
- A new version of the cook county jail scraper, inspired by the Supreme Chi-Town Coding Crew ☆23 Updated last year
- Intro to Python for data analysis (NICAR 2019) ☆17 Updated 5 years ago
- how I FOIA (and maybe how you can too!) ☆21 Updated 6 years ago
- JSON to geocode list of addresses in Open Refine, using Bing and OpenStreetMap Nominatim APIs ☆28 Updated last year
- Teaching guide for a one-hour hands-on session at an IRE/NICAR conference on scraping web data using Python. ☆18 Updated 11 months ago
- this is the code that goes along with the AJC story at https://www.ajc.com/news/state--regional-govt--politics/precinct-closures-harm-vot… ☆13 Updated 4 years ago
- How Quartz used AI to help reporters search the Mauritius Leaks ☆45 Updated 5 years ago
- Command-line tool for exploring the PAC donor-recipient relationship ☆54 Updated 9 years ago
- A Node module to parse raw FEC electronic filings, inspired by Fech. ☆17 Updated last year
- Loads raw FEC filings into a database ☆21 Updated last year
- A quick repo with basic command line commands, plus a very brief CSVKit run through. ☆31 Updated 4 years ago
- For students of https://projects.propublica.org/graphics/ida-propublica-data-institute ☆26 Updated 2 years ago
- yet another foia automation service ☆41 Updated 2 years ago
- Scrapes municipal data from Legistar websites ☆42 Updated 5 months ago
- Tracing policy ideas from think tanks and lobbyists through state legislative bills ☆42 Updated 8 years ago
- POLITICO's system for managing civic data ☆20 Updated last year
- Interactive and searchable House staffer directory, based on House disbursement data. ☆26 Updated 8 months ago
- Materials for the ProPublica Data Institute 2019 ☆43 Updated 5 years ago
- Materials for the PostgreSQL hands-on class at NICAR 2018 in Chicago. ☆10 Updated 6 years ago
- A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCES… ☆64 Updated last month
- Teaching guide for a one-hour hands-on session at an IRE/NICAR conference on using pandas to analyze data. ☆17 Updated 4 months ago
- Docs and info from my 2018 workshop at the CAR conference ☆29 Updated 6 years ago
- a python parser for the .fec file format ☆44 Updated last year
- Archive of political ad data from the Federal Communications Commission ☆20 Updated 7 years ago
- Project generator for use with the datakit framework. ☆27 Updated 8 months ago