CitizensFoundation / pace-keyword-scanner
CommonCrawl keyword scanner. Scanning one month of CommonCrawl data for hundreds of keywords takes about 3 hours on an EC2 c5.18xlarge instance. Second-level filtering is LLM-based (BERT). Developed with support from the EU and the Populism & Civic Engagement H2020 project.
☆17Updated 2 years ago
Alternatives and similar repositories for pace-keyword-scanner
Users that are interested in pace-keyword-scanner are comparing it to the libraries listed below
- Ontology dataset for open_numbers namespace☆10Updated 3 weeks ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec.☆117Updated this week
- Easily build and maintain any kind of contract. Free and Open Source☆98Updated 8 years ago
- Java libraries to read and write real estate data in common formats (e.g. OpenImmo, ImmoXML, Kyero, Trovit, IDX)☆55Updated 2 years ago
- Real-Time Proxy & Web Scraping API☆24Updated 6 years ago
- Ricgraph - Research in context graph☆29Updated last week
- An open-source archive that gathers, saves, shares and analyzes news homepages☆149Updated last week
- A command-line interface that allows you to manage the back end of your self-hosted Typesense server. Builds on top of the typesense js l…☆16Updated 2 years ago
- keywords-extract - Command-line tool to extract keywords from any web page.☆61Updated 7 years ago
- Datasette plugin for uploading CSV files and converting them to database tables☆27Updated last month
- Fully customizable open source voice experience that can be hosted on any website.☆32Updated 3 years ago
- API client for Aleph, supports bulk entity and document upload.☆28Updated last year
- A minimal client-side library to convert your vanilla URLs to deep links.☆19Updated 4 years ago
- A list of awesome browser extensions to help with SEO and rank higher!☆24Updated 5 years ago
- ☆14Updated 3 years ago
- Code and data belonging to our CSCW 2019 paper: "Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites".☆135Updated 6 years ago
- Extract networks of entities from journalistic reporting☆49Updated 2 years ago
- A case management app built with Lowdefy.☆32Updated last year
- List of free and checked http, https, socks4 and socks5 proxies☆17Updated 2 weeks ago
- Inline-Editor.js Tool for Editor.js☆38Updated 2 years ago
- Matrix-based News Aggregation to Explore Media Bias☆20Updated 7 years ago
- A curated list of promising Web Data Extractors resources☆29Updated 6 years ago
- Architectural design records, technical notes, and issues for the digital land project☆29Updated 5 months ago
- Convert excel sheet to SEO friendly sitemaps!☆12Updated 3 years ago
- The Misinformation Game is a social-media simulator built to study how people interact with information on social-media.☆31Updated last month
- Datasette plugin for rendering HTML based on JSON values☆29Updated 3 years ago
- RSS Reader API written in Django Rest☆46Updated last year
- Create a static website with Fly - HTML from the example☆21Updated last year
- collab-dev - Collaboration Metrics for Code Reviews☆19Updated 7 months ago
- Scrape HN to track links from specific domains☆69Updated this week