brandonrobertz / autoscrape-py
An automated, programming-free web scraper for interactive sites
☆110 Updated last year
Alternatives and similar repositories for autoscrape-py:
Users who are interested in autoscrape-py are comparing it to the libraries listed below.
- A Python scraper for the Facebook Ad Library, using the official Facebook Ad Library API. ☆119 Updated 5 years ago
- How Quartz used AI to help reporters search the Mauritius Leaks ☆47 Updated 5 years ago
- ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of diff… ☆88 Updated 3 years ago
- Scrapers for U.S. county court sites. ☆66 Updated 2 years ago
- ⚡️ Enriches data, adding columns based on lookups to online services ☆22 Updated 2 months ago
- Run Overview on your own system ☆124 Updated 3 years ago
- Data model and processing tools for investigative entity data ☆228 Updated this week
- Command-line interface for downloading WARN Act notices of qualified plant closings and mass layoffs from state government websites ☆31 Updated last week
- ☆23 Updated 9 years ago
- Loads raw FEC filings into a database ☆22 Updated 2 years ago
- 🎓 Practical beginner-level introductions to using different tools and technologies, with a focus on their application in the newsroom ☆81 Updated 2 years ago
- Scraper for Facebook's Archive of Ads with Political Content ☆36 Updated 6 years ago
- Teaching guide for a one-hour hands-on session at an IRE/NICAR conference on using pandas to analyze data. ☆20 Updated 2 months ago
- List of publicly available, free/open source and open access resources for learning and doing data journalism. ☆45 Updated last year
- Module on both the MA Data Journalism and MA Multiplatform and Mobile Journalism at Birmingham City University ☆28 Updated 2 months ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 Updated 4 years ago
- Public client for consuming content from the Media Cloud Online News Archive & Directory. ☆72 Updated 4 months ago
- 🔎 Finds fuzzy matches between CSV files ☆189 Updated last month
- Notebooks and files for the Python for Journalists course on Datajournalism.com ☆60 Updated 4 years ago
- Collector for Facebook's Political Ad API ☆31 Updated 2 years ago
- Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation. ☆151 Updated 3 months ago
- ETL pipeline, graphical explorer and general toolbox for investigations with follow-the-money data ☆23 Updated last year
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated 6 months ago
- Docker container for Make-based PDF extraction using OCR ☆12 Updated 9 months ago
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data. ☆13 Updated 2 months ago
- Mecodify tool for Twitter data analysis and visualisation ☆42 Updated last year
- Tag news stories based on models trained on the NYT corpus. ☆42 Updated 2 years ago
- ☆12 Updated last year
- An ICIJ app to conduct data validation and cleaning. ☆20 Updated this week