CitizensFoundation / pace-keyword-scanner
CommonCrawl keyword scanner. Scanning one month of Common Crawl data for hundreds of keywords takes about 3 hours on an EC2 c5.18xlarge instance. LLM (BERT) based second-level filtering. Developed with support from the EU and the Populism & Civic Engagement H2020 project.
☆16Updated 2 years ago
Alternatives and similar repositories for pace-keyword-scanner
Users who are interested in pace-keyword-scanner are comparing it to the libraries listed below.
- Track changes to GraphQL APIs by git scraping their schemas☆30Updated 7 months ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec.☆111Updated this week
- Ricgraph - Research in context graph☆29Updated last week
- Ontology dataset for open_numbers namespace☆10Updated last week
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.…☆24Updated 5 years ago
- A minimal client-side library to convert your vanilla URLs to deep links.☆20Updated 4 years ago
- Public API client for GETTR, a "non-bias [sic] social network," designed for data archival and analysis.☆96Updated 4 months ago
- RSS Reader API written in Django Rest☆45Updated last year
- keywords-extract - Command line tool to extract keywords from any web page.☆61Updated 7 years ago
- An open-source archive that gathers, saves, shares and analyzes news homepages☆146Updated 2 weeks ago
- A case management app built with Lowdefy.☆32Updated last year
- The Toolkit API, app, and browser extension. Start preserving now.☆47Updated last week
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends☆57Updated last year
- A helper library full of URL-related heuristics.☆73Updated last month
- Java libraries to read and write real estate data in common formats (e.g. OpenImmo, ImmoXML, Kyero, Trovit, IDX)☆55Updated last year
- Easily build and maintain any kind of contract. Free and Open Source☆96Updated 8 years ago
- ☆16Updated this week
- Convert excel sheet to SEO friendly sitemaps!☆12Updated 2 years ago
- A list of awesome browser extensions to help with SEO and rank higher!☆24Updated 5 years ago
- Matrix-based News Aggregation to Explore Media Bias☆20Updated 7 years ago
- A curated list of promising Web Data Extractors resources☆29Updated 5 years ago
- Datasette enrichment for analyzing row data using OpenAI's GPT models☆21Updated last year
- VFRAME: Visual Forensics and Metadata Extraction☆74Updated 2 years ago
- A curated list of awesome resources on crowdsourcing, human computation, and online behavioral experiments.☆49Updated 7 years ago
- The Misinformation Game is a social-media simulator built to study how people interact with information on social-media.☆31Updated last week
- Datasette plugin for publishing data using Vercel☆45Updated 3 years ago
- Now included in rigour☆153Updated 2 months ago
- Citadel: Enterprise Search☆14Updated 2 years ago
- 🗳️ Monitor your country, your city council or your organization promises and objectives☆14Updated 4 years ago
- ✏️ Free open source Web User Interface for OhMyForm ⛺☆61Updated last year