jplusplus / statscraper
A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data.
☆13 Updated 3 months ago
Alternatives and similar repositories for statscraper
Users interested in statscraper are comparing it to the libraries listed below.
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 Updated 4 years ago
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated 8 months ago
- A financial disclosure data extraction tool. ☆16 Updated last year
- scraper for facebook, gab, google and tiktok ☆21 Updated this week
- Service for creating Twitter datasets for research and archiving. ☆26 Updated 2 years ago
- Tools for tracking stories on news homepages ☆48 Updated 5 years ago
- A tool to allow US addresses to be geocoded/georeferenced easily, without using Python or the command line or paid services or anything. ☆18 Updated 2 years ago
- Ask questions about government data. ☆37 Updated 6 years ago
- An alpha project combining beneficial ownership and contracting data ☆13 Updated 4 years ago
- ☆11 Updated 6 years ago
- Docker Container for a Make-based, PDF extraction using OCR ☆12 Updated 10 months ago
- R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance ☆12 Updated 8 years ago
- A Python library for defining rule-based overrides on messy data ☆16 Updated 2 months ago
- how hard is it to get a list of all local news sites in the United States (LOL) ☆8 Updated 5 years ago
- All the files and documentation necessary to reuse, remix and translate A Field Guide to "Fake News" and Other Information Disorders. ☆62 Updated 4 years ago
- Extract networks of entities from journalistic reporting ☆48 Updated last year
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- How Quartz used AI to help reporters search the Mauritius Leaks ☆47 Updated 5 years ago
- Extract list of results from search engines pages as CSV with a bookmarklet directly within the browser ☆24 Updated 2 months ago
- The core of sunlightlabs' Data Commons project. Includes the Transparency Data site and the APIs that power TransparencyData.com and Infl… ☆38 Updated 8 years ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated 2 years ago
- Datakit plugin to help manage Github integration on data projects. ☆12 Updated 2 years ago
- Uses NLP methods to parse and classify contracts from The City of New Orleans ☆10 Updated 10 years ago
- European Parliament Open Data // Twitter ☆20 Updated 2 years ago
- Monitor datasets, gets alerts when something happens ☆210 Updated 6 years ago
- Python and pandas tools to perform various analyses on different types of word lists ☆16 Updated 10 years ago
- Interactive and searchable House staffer directory, based on House disbursement data. ☆27 Updated last year
- Mecodify tool for twitter data analysis and visualisation ☆42 Updated last year
- A collaborative collection of structured datasets and document collections that are common to use within "Follow the Money" investigation… ☆13 Updated last week
- ⚡️ Enriches data, adding columns based on lookups to online services ☆22 Updated this week