blues-lab / polipy
Library for scraping, parsing, and analyzing privacy policies.
☆14 · Updated 2 years ago
Alternatives and similar repositories for polipy:
Users who are interested in polipy are comparing it to the libraries listed below.
- Toolchain to retrieve and parse privacy policies from websites as described in our paper "Unifying Privacy Policy Detection" by Henry Hos… ☆16 · Updated 5 months ago
- Historical website privacy policies spanning over two decades. ☆115 · Updated last year
- The code processes URLs in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private… ☆23 · Updated 3 years ago
- Measure the readability of a given text using surface characteristics ☆76 · Updated 2 weeks ago
- A crawler that uses OpenWPM. ☆12 · Updated 3 years ago
- Materials to reproduce our findings in our stories, "Amazon Puts Its Own 'Brands' First Above Better-Rated Products" and "When Amazon Tak… ☆68 · Updated 3 years ago
- A list of over 5000 US news domains and their social media accounts ☆43 · Updated last year
- Extract networks of entities from journalistic reporting ☆48 · Updated last year
- Tools for interactive visual exploration of semantic embeddings. ☆30 · Updated 5 months ago
- A News Article Collection Library ☆22 · Updated last year
- Pushshift Telegram Ingest ☆84 · Updated 5 years ago
- an experimental implementation of Burrow's delta in Python 3 ☆20 · Updated 3 years ago
- A database of courts, tests and other experiments ☆67 · Updated this week
- Tools to construct and process webgraphs from Common Crawl data ☆85 · Updated 2 weeks ago
- A collection of code, data and information related to our audit of TikTok. ☆21 · Updated this week
- ☆19 · Updated 2 years ago
- A database of court reporters, tests and other experiments ☆100 · Updated this week
- Fast and robust date extraction from web pages, with Python or on the command-line ☆122 · Updated last month
- Text and statistics utilities from Pew Research Center ☆83 · Updated 3 years ago
- Trust and Safety Teaching Consortium ☆63 · Updated 2 months ago
- Stanford Internet Observatory publications ☆14 · Updated 3 years ago
- etl pipeline, graphical explorer and general toolbox for investigations with follow the money data ☆15 · Updated last year
- spaCy extension for Visual Studio Code ☆27 · Updated last year
- wrapper for the crossref events api ☆18 · Updated last year
- Run information flow experiments on the Web ☆39 · Updated 3 years ago
- Email Datasets can be found here ☆59 · Updated 5 years ago
- an extensible tool to generate hyperlinks from legal citations ☆33 · Updated 4 months ago
- A helper library full of URL-related heuristics. ☆64 · Updated 4 months ago
- A Python Wrapper To Retrieve Data From The CrowdTangle API ☆11 · Updated 8 months ago
- Simplified version of a common crawl fetcher ☆13 · Updated this week