blues-lab / polipy
Library for scraping, parsing, and analyzing privacy policies.
☆16 Updated 2 years ago
Alternatives and similar repositories for polipy
Users who are interested in polipy are comparing it to the libraries listed below.
- Toolchain to retrieve and parse privacy policies from websites as described in our paper "Unifying Privacy Policy Detection" by Henry Hos… ☆17 Updated 4 months ago
- Historical website privacy policies spanning over two decades. ☆130 Updated last year
- Run information flow experiments on the Web ☆39 Updated 4 years ago
- The code processes URLs in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private… ☆23 Updated 3 years ago
- A list of over 5000 US news domains and their social media accounts ☆44 Updated 2 years ago
- An automated, programming-free web scraper for interactive sites ☆111 Updated 2 years ago
- Tools to construct and process Common Crawl webgraphs ☆93 Updated last week
- Code release for our WWW 2019 paper entitled "ShadowBlock: A Lightweight and Stealthy Adblocking Browser". ☆20 Updated 6 years ago
- A database of courts, tests and other experiments ☆90 Updated this week
- Pushshift Telegram Ingest ☆86 Updated 5 years ago
- A browser extension to collect social media data with. ☆279 Updated 2 months ago
- Frontend component for Hoaxy, a tool to visualize the spread of claims and fact checking ☆72 Updated 2 years ago
- A verification “Swiss army knife” helping journalists, fact-checkers, and human rights defenders to save time and be more efficient in th… ☆40 Updated this week
- Fast and robust date extraction from web pages, with Python or on the command-line ☆138 Updated last month
- A webmining CLI tool & library for python. ☆334 Updated last week
- Project repository for "Evaluating the persuasive influence of political microtargeting with large language models" by Kobi Hackenburg an… ☆10 Updated last year
- The AI Incident Database seeks to identify, define, and catalog artificial intelligence incidents. ☆203 Updated last week
- A Stylometry Library for Python ☆145 Updated 2 years ago
- Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online me… ☆285 Updated last year
- A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine. ☆453 Updated last year
- The 4CAT Capture and Analysis Toolkit provides modular data capture & analysis for a variety of social media platforms. ☆326 Updated last week
- Common crawl extractor ☆78 Updated last year
- Comprehensive database of ratings for 11k news domains ☆28 Updated last year
- Trust and Safety Teaching Consortium ☆71 Updated 3 months ago
- ☆73 Updated last week
- Code and data belonging to our CSCW 2019 paper: "Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites". ☆132 Updated 6 years ago
- CAP database scripts. ☆191 Updated 11 months ago
- ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of diff… ☆88 Updated 3 years ago
- A helper library full of URL-related heuristics. ☆70 Updated 2 months ago
- Statistics of Common Crawl monthly archives mined from URL index files ☆188 Updated 2 weeks ago