CitizensFoundation / pace-keyword-scanner
CommonCrawl keyword scanner. Scanning one month of CommonCrawl data for hundreds of keywords takes about 3 hours on an EC2 c5.18xlarge instance. Second-level filtering is LLM (BERT) based. Developed with support from the EU and the Populism & Civic Engagement H2020 project.
☆14 · Updated last year
Alternatives and similar repositories for pace-keyword-scanner:
Users that are interested in pace-keyword-scanner are comparing it to the libraries listed below
- Track changes to GraphQL APIs by git scraping their schemas ☆28 · Updated this week
- Datasette enrichment for analyzing row data using OpenAI's GPT models ☆19 · Updated 10 months ago
- A Google Trends Analytics Package ☆13 · Updated 9 months ago
- Datasette showing global power plant data from https://github.com/wri/global-power-plant-database ☆17 · Updated 2 years ago
- ☆12 · Updated last year
- An introduction to free, automated web scraping with GitHub’s powerful new Actions framework. ☆28 · Updated 7 months ago
- Vector Embedding Markup Language - markup language designed specifically for annotating and structuring data related to vector embeddings… ☆12 · Updated 11 months ago
- Dead simple cron service for making HTTP calls on a regular schedule. ☆14 · Updated 4 years ago
- H2O is a web app for creating and reading open educational resources, primarily in the legal field ☆38 · Updated last month
- how hard is it to get a list of all local news sites in the United States (LOL) ☆8 · Updated 4 years ago
- Matrix-based News Aggregation to Explore Media Bias ☆20 · Updated 6 years ago
- Everything you need to know about Aquila Network Neural Search Ecosystem. Official repositories, client libraries, ecosystem projects, boi… ☆32 · Updated 3 years ago
- A visualisation library for beneficial ownership structures ☆21 · Updated last month
- ☆11 · Updated 4 months ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆80 · Updated this week
- Dataset files for the Open Data on GitHub paper ☆26 · Updated 2 weeks ago
- Ontology dataset for open_numbers namespace ☆10 · Updated 4 months ago
- A demonstration transnational register of beneficial ownership data from the UK, Denmark, Slovakia and Armenia ☆17 · Updated 4 months ago
- ☆14 · Updated 3 years ago
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.… ☆23 · Updated 4 years ago
- GitHub statistics ☆12 · Updated 2 years ago
- Java libraries to read and write real estate data in common formats (e.g. OpenImmo, ImmoXML, Kyero, Trovit, IDX) ☆52 · Updated last year
- Repository to allow collaboration between Cycle Labs Cloud community in support of the community. ☆9 · Updated 3 years ago
- Fully customizable open source voice experience that can be hosted on any website. ☆33 · Updated 2 years ago
- ☆24 · Updated last year
- Datasette plugin for uploading CSV files and converting them to database tables ☆26 · Updated 11 months ago
- Data Catalog Specification (Schema and Protocol) ☆21 · Updated 6 years ago
- ☆26 · Updated 4 years ago
- Scrape various open data directories to create an index of what's available out there ☆36 · Updated last month
- 🗳️ Monitor your country, your city council or your organization promises and objectives ☆14 · Updated 3 years ago