CitizensFoundation / pace-keyword-scanner
CommonCrawl keyword scanner. Scanning one month of CC data for hundreds of keywords takes about 3 hours on an EC2 c5.18xlarge instance. LLM (BERT) based second-level filtering. Developed with support from the EU and the Populism & Civic Engagement H2020 project.
☆15Updated 2 years ago
Alternatives and similar repositories for pace-keyword-scanner
Users that are interested in pace-keyword-scanner are comparing it to the libraries listed below
- Real-Time Proxy & Web Scraping API☆24Updated 5 years ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec.☆97Updated this week
- Ontology dataset for open_numbers namespace☆10Updated 10 months ago
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.…☆25Updated 5 years ago
- Inline-Editor.js Tool for Editor.js☆39Updated 2 years ago
- Ricgraph - Research in context graph☆30Updated this week
- A case management app built with Lowdefy.☆32Updated last year
- An open-source archive that gathers, saves, shares and analyzes news homepages☆144Updated last week
- keywords-extract - Command line tool to extract keywords from any web page.☆63Updated 6 years ago
- Scrape various open data directories to create an index of what's available out there☆37Updated 7 months ago
- API client for Aleph, supports bulk entity and document upload.☆28Updated 11 months ago
- ☆12Updated last year
- RSS Reader API written in Django Rest☆44Updated last year
- A minimal client-side library to convert your vanilla URLs to deep links.☆20Updated 4 years ago
- ☆30Updated 11 years ago
- A helper library full of URL-related heuristics.☆70Updated 2 weeks ago
- A curated list of awesome resources on crowdsourcing, human computation, and online behavioral experiments.☆49Updated 7 years ago
- A curated list of promising Web Data Extractors resources☆29Updated 5 years ago
- Browser version of Hyphe (WIP)☆31Updated 4 months ago
- Tools to construct and process Common Crawl webgraphs☆96Updated 3 weeks ago
- Coordinated vulnerability disclosure (CVD) for security discoveries, bug reporting, breach analysis, etc.☆17Updated 5 months ago
- Contact your Government Representative: Send a Letter or Fax in <30 Seconds☆13Updated 2 years ago
- Easily build and maintain any kind of contract. Free and Open Source☆96Updated 7 years ago
- Frontend interface for Datashare, a self-hosted search engine for documents.☆38Updated this week
- Create a static website with Fly - HTML from the example☆21Updated last year
- Extract networks of entities from journalistic reporting☆48Updated 2 years ago
- ☆26Updated 4 years ago
- Python, Javascript, and Rust libraries for the Spider Cloud API.☆19Updated 2 weeks ago
- Containerized workflow automation tool☆21Updated this week
- Datasette plugin for rendering HTML based on JSON values☆28Updated 3 years ago