jplusplus / statscraper
A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data.
☆13 Updated last year
Related projects
Alternatives and complementary repositories for statscraper
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated last month
- Scraper for Facebook, Gab, Google and TikTok ☆22 Updated 4 months ago
- An alpha project combining beneficial ownership and contracting data ☆13 Updated 3 years ago
- A financial disclosure data extraction tool. ☆13 Updated last year
- Extract networks of entities from journalistic reporting ☆47 Updated last year
- Ask questions about government data. ☆37 Updated 5 years ago
- DEPRECATED. Desktop graph visualization application ☆50 Updated 2 years ago
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 Updated 4 years ago
- Frontend interface for Datashare, a self-hosted search engine for documents. ☆32 Updated this week
- Service for creating Twitter datasets for research and archiving. ☆26 Updated last year
- A maximum-strength name parser for record linkage. ☆34 Updated 3 months ago
- Converter for ICIJ Offshore Leaks data into FollowTheMoney format ☆12 Updated 2 years ago
- Tools for tracking stories on news homepages ☆48 Updated 5 years ago
- A Python library for defining rule-based overrides on messy data ☆12 Updated this week
- Parse Popolo JSON data and navigate it with Python ☆15 Updated 4 years ago
- A helper library full of URL-related heuristics. ☆64 Updated last month
- A scraper focused on organizational GitHub accounts and their members. ☆40 Updated 2 years ago
- The core of sunlightlabs' Data Commons project. Includes the Transparency Data site and the APIs that power TransparencyData.com and Infl… ☆38 Updated 8 years ago
- how hard is it to get a list of all local news sites in the United States (LOL) ☆8 Updated 4 years ago
- Scraping Assisted by Learning ☆35 Updated 2 months ago
- Extract a list of results from search engine pages as CSV, using a bookmarklet directly within the browser ☆19 Updated this week
- All the files and documentation necessary to reuse, remix and translate A Field Guide to "Fake News" and Other Information Disorders. ☆61 Updated 4 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- Monitor datasets, get alerts when something happens ☆211 Updated 5 years ago
- Tracking the history of the FARA data from https://www.justice.gov/nsd-fara ☆14 Updated last year
- Grabbing all news. ☆62 Updated 4 years ago
- World Religion Projections (2010-2050) ☆13 Updated 3 weeks ago