facebookresearch / URL-Sanitization
The code processes URLs in an attempt to consolidate different web addresses that point to the same resource and to remove potentially private and/or sensitive data. It is part of the Facebook URL shares release effort, which is led by the Election Research Commission (ERC).
☆23 Updated 3 years ago
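As a rough illustration of what URL consolidation and sanitization can involve, here is a minimal Python sketch. It is not the repository's implementation: the function name `sanitize_url` and the `DROP_PARAMS` list are hypothetical, and the rules shown (lowercasing the host, dropping default ports, credentials, fragments, and selected query parameters, then sorting what remains) are assumptions chosen only for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of query parameters treated as tracking or potentially
# private data. A real deployment would need a carefully reviewed list.
DROP_PARAMS = {"fbclid", "gclid", "utm_source", "utm_medium",
               "utm_campaign", "email", "token"}

DEFAULT_PORTS = {"http": 80, "https": 443}


def sanitize_url(url: str) -> str:
    """Return a canonical form of `url` with sensitive parts removed."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # `hostname` is already lowercased and excludes any user:password@ prefix.
    host = parts.hostname or ""
    netloc = host
    # Keep the port only when it is not the default for the scheme.
    if parts.port is not None and DEFAULT_PORTS.get(scheme) != parts.port:
        netloc = f"{host}:{parts.port}"
    # Drop unwanted parameters and sort the rest so that parameter order
    # does not produce distinct URLs.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in DROP_PARAMS
    )
    path = parts.path or "/"
    # The fragment is discarded because it is not sent to the server.
    return urlunsplit((scheme, netloc, path, urlencode(kept), ""))


if __name__ == "__main__":
    print(sanitize_url("HTTPS://Example.com:443/a?utm_source=x&b=2&a=1#frag"))
    print(sanitize_url("http://user:pw@Example.com/?fbclid=abc"))
```

Under these assumed rules, two addresses that differ only in letter case of the host, parameter order, tracking parameters, or fragment map to the same string, which is the consolidation effect the description refers to.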
Alternatives and similar repositories for URL-Sanitization:
Users who are interested in URL-Sanitization are comparing it to the libraries listed below.
- Materials to reproduce our findings in our stories, "Amazon Puts Its Own 'Brands' First Above Better-Rated Products" and "When Amazon Tak… ☆69 Updated 3 years ago
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 Updated 4 years ago
- 2020-election-night-model ☆58 Updated 4 years ago
- ☆70 Updated 3 months ago
- Tools for collecting social media data around focal events ☆84 Updated 3 years ago
- A set of Jupyter notebooks demonstrating how to use the Media Cloud API. ☆37 Updated last year
- ☆36 Updated 6 years ago
- The documentation and scripts for the Local News Dataset ☆25 Updated 3 years ago
- Given a set of URLs, this package detects coordinated link sharing behavior on social media and outputs the network of entities that per… ☆75 Updated 8 months ago
- A maximum-strength name parser for record linkage. ☆36 Updated 3 weeks ago
- Tool for probabilistically linking the records of individual entities (e.g. people) within and across datasets ☆112 Updated 4 months ago
- Tracing policy ideas from think tanks and lobbyists through state legislative bills ☆45 Updated 8 years ago
- A curated list of awesome data sources related to elections, electoral reforms, and democratic political systems. ☆78 Updated 3 years ago
- smappdragon is a set of tools for working with Twitter data. ☆29 Updated 6 years ago
- Public client for consuming content from the Media Cloud Online News Archive & Directory. ☆72 Updated 4 months ago
- Closed Caption Transcripts of News Videos from archive.org 2014–2023 ☆47 Updated last week
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated last year
- Hierarchical clustering of 2011-2022 Congress Twitter ☆29 Updated 2 years ago
- Investigating how COVID-19 shaped Anti-Asian Climate ☆12 Updated 3 years ago
- An alpha project combining beneficial ownership and contracting data ☆13 Updated 3 years ago
- MPEDS Annotation Interface ☆18 Updated 2 years ago
- Materials to reproduce findings in our stories, "Swinging the Vote?", and "To Gmail, Most Black Lives Matter Emails Are 'Promotions'" ☆38 Updated 10 months ago
- An R package for accessing the Facebook Ad Library API ☆74 Updated last year
- A multi-modal Twitter dataset with 7.6M tweets and 25.6M retweets related to voter fraud claims. ☆53 Updated 3 years ago
- A Python module for clustering creators of social media content into networks ☆74 Updated 3 years ago
- COMM 4940: Governing Human-Algorithm Behavior ☆21 Updated 10 months ago
- Research-grade URL expansion for Python. ☆27 Updated 6 years ago
- Inspect a URL and estimate if it contains a news story ☆39 Updated 5 months ago
- PageOneX. Analyzing front pages ☆52 Updated 5 months ago
- OCCRP and media partners collected data on COVID-19 related spending from across Europe from February to October 2020 ☆13 Updated 4 years ago