CitizensFoundation / pace-keyword-scanner
CommonCrawl keyword scanner. Scanning one month of CommonCrawl data for hundreds of keywords takes about 3 hours on an EC2 c5.18xlarge instance. LLM (BERT)-based second-level filtering. Developed with support from the EU and the Populism & Civic Engagement H2020 project.
☆15Updated 2 years ago
Alternatives and similar repositories for pace-keyword-scanner
Users that are interested in pace-keyword-scanner are comparing it to the libraries listed below
- Track changes to GraphQL APIs by git scraping their schemas☆29Updated 4 months ago
- A curated list of promising Web Data Extractors resources☆29Updated 5 years ago
- Real-Time Proxy & Web Scraping API☆24Updated 5 years ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec.☆94Updated this week
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.…☆25Updated 4 years ago
- A case management app built with Lowdefy.☆32Updated last year
- An open-source archive that gathers, saves, shares and analyzes news homepages☆144Updated 3 weeks ago
- Java libraries to read and write real estate data in common formats (e.g. OpenImmo, ImmoXML, Kyero, Trovit, IDX)☆55Updated last year
- Fully customizable open source voice experience that can be hosted on any website.☆33Updated 3 years ago
- Variety/Strain Database☆19Updated 2 months ago
- A list of awesome browser extensions to help with SEO and rank higher!☆25Updated 4 years ago
- A curated list of awesome resources on crowdsourcing, human computation, and online behavioral experiments.☆49Updated 7 years ago
- Inline-Editor.js Tool for Editor.js☆39Updated 2 years ago
- Dockerized workflow automation tool☆21Updated 2 weeks ago
- A helper library full of URL-related heuristics.☆70Updated 2 months ago
- Datasette plugin for rendering HTML based on JSON values☆27Updated 3 years ago
- Easily build and maintain any kind of contract. Free and Open Source☆96Updated 7 years ago
- ☆14Updated 3 years ago
- Scrape HN to track links from specific domains☆63Updated this week
- A minimal client-side library to convert your vanilla URLs to deep links.☆20Updated 4 years ago
- Spin up a fully configured Ubuntu/Debian-based web server in under 10 minutes with Nginx (w/ HTTPS), PHP FPM, Postfix, OpenDKIM, MySQL/Ma…☆21Updated 3 years ago
- keywords-extract - Command line tool extract keywords from any web page.☆63Updated 6 years ago
- Matrix-based News Aggregation to Explore Media Bias☆20Updated 7 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends☆57Updated last year
- Coordinated vulnerability disclosure (CVD) for security discoveries, bug reporting, breach analysis, etc.☆17Updated 4 months ago
- Web-based application to manage documents, images, videos and geodata.☆33Updated 2 years ago
- RSS Reader API written in Django Rest☆44Updated last year
- The Toolkit API, app, and browser extension. Start preserving now.☆47Updated last week
- A helper to compare and identify similar keywords using PHP.☆10Updated 2 years ago
- Scrape data from BuiltWith.com☆18Updated 7 years ago