CitizensFoundation / pace-keyword-scanner
CommonCrawl keyword scanner. Scanning one month of CommonCrawl data for hundreds of keywords takes about 3 hours on an EC2 c5.18xlarge instance. Second-level filtering is LLM (BERT) based. Developed with support from the EU and the Populism & Civic Engagement H2020 project.
☆13 · Updated last year
Alternatives and similar repositories for pace-keyword-scanner:
Users who are interested in pace-keyword-scanner are comparing it to the libraries listed below:
- Track changes to GraphQL APIs by git scraping their schemas ☆27 · Updated this week
- A visualisation library for beneficial ownership structures ☆21 · Updated this week
- etl pipeline, graphical explorer and general toolbox for investigations with follow the money data ☆15 · Updated last year
- Datasette showing global power plant data from https://github.com/wri/global-power-plant-database ☆17 · Updated 2 years ago
- A demonstration transnational register of beneficial ownership data from the UK, Denmark, Slovakia and Armenia ☆17 · Updated 3 months ago
- Datasette plugin for publishing data using Vercel ☆44 · Updated 2 years ago
- A Flat Data GitHub Action demo repo ☆35 · Updated 2 weeks ago
- An introduction to free, automated web scraping with GitHub’s powerful new Actions framework. ☆28 · Updated 5 months ago
- ☆14 · Updated 3 years ago
- Datasette enrichment for analyzing row data using OpenAI's GPT models ☆19 · Updated 9 months ago
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 · Updated 4 years ago
- Pull out versions of specific files from a gitscraping repo into individual files. ☆15 · Updated 3 years ago
- Examples of bad data, especially from government. ☆22 · Updated 6 months ago
- Dead simple cron service for making HTTP calls on a regular schedule. ☆14 · Updated 4 years ago
- ☆26 · Updated 4 years ago
- Extract networks of entities from journalistic reporting ☆48 · Updated last year
- A Google Trends Analytics Package ☆13 · Updated 8 months ago
- A collaborative collection of datasets that are common to use within "Follow the Money" investigations with european scope ☆13 · Updated 8 months ago
- Interactive visual tool for the demonstration of topic evolution ☆40 · Updated 4 years ago
- ☆24 · Updated last year
- Datasette plugin providing a UI for executing SQL writes against the database ☆10 · Updated 5 months ago
- 🗳️ Monitor your country, your city council or your organization promises and objectives ☆14 · Updated 3 years ago
- ☆28 · Updated 10 years ago
- d3 plugin to create a temporal network visualization ☆18 · Updated 2 years ago
- H2O is a web app for creating and reading open educational resources, primarily in the legal field ☆37 · Updated this week
- how hard is it to get a list of all local news sites in the United States (LOL) ☆8 · Updated 4 years ago
- API client for Aleph, supports bulk entity and document upload. ☆28 · Updated 4 months ago
- Inspect Element is a practitioner's guide to auditing algorithms and data-driven investigations ☆29 · Updated last month
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆79 · Updated this week
- Inspect a URL and estimate if it contains a news story ☆39 · Updated 2 months ago