jplusplus / statscraper
A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data.
☆13 Updated 3 weeks ago
Alternatives and similar repositories for statscraper:
Users who are interested in statscraper are comparing it to the libraries listed below.
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated 5 months ago
- A financial disclosure data extraction tool. ☆14 Updated last year
- scraper for facebook, gab, google and tiktok ☆22 Updated 8 months ago
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 Updated 4 years ago
- An alpha project combining beneficial ownership and contracting data ☆13 Updated 3 years ago
- Service for creating Twitter datasets for research and archiving. ☆26 Updated 2 years ago
- A Python library for defining rule-based overrides on messy data ☆13 Updated 4 months ago
- how hard is it to get a list of all local news sites in the United States (LOL) ☆8 Updated 4 years ago
- GenderTracker is a service that decomposes articles and computes various gender-related metrics based on the content. ☆25 Updated 11 years ago
- ☆12 Updated 5 years ago
- Ask questions about government data. ☆37 Updated 6 years ago
- R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance ☆12 Updated 8 years ago
- a general list of resources and articles for people interested in getting into data journalism ☆16 Updated last year
- A helper library full of URL-related heuristics. ☆66 Updated 5 months ago
- OpenRefine for Social Science Data ☆24 Updated this week
- Converter for ICIJ Offshore Leaks data into FollowTheMoney format ☆12 Updated 3 years ago
- Deduplicate and parse list of `dirty names' ☆19 Updated 4 years ago
- Research-grade URL expansion for Python. ☆26 Updated 6 years ago
- 📕 Writing tests, the DataMade way ☆16 Updated 4 years ago
- DEPRECATED. Desktop graph visualization application ☆50 Updated 2 years ago
- Extract list of results from search engines pages as CSV with a bookmarklet directly within the browser ☆19 Updated last month
- 📒 Analyzing Data, the DataMade Way ☆37 Updated 4 years ago
- Extract networks of entities from journalistic reporting ☆48 Updated last year
- The core of sunlightlabs' Data Commons project. Includes the Transparency Data site and the APIs that power TransparencyData.com and Infl… ☆38 Updated 8 years ago
- Scraping Assisted by Learning ☆35 Updated this week
- Docker Container for a Make-based, PDF extraction using OCR ☆12 Updated 7 months ago
- Tools for tracking stories on news homepages ☆48 Updated 5 years ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated last year
- A maximum-strength name parser for record linkage. ☆36 Updated last month
- etl pipeline, graphical explorer and general toolbox for investigations with follow the money data ☆15 Updated last year