facebookresearch / URL-Sanitization
The code processes URLs in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private and/or sensitive data. It is part of the Facebook URL shares release effort, which is led by the Election Research Commission (ERC).
☆23 Updated 3 years ago
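As a rough illustration of the kind of processing the description refers to, here is a minimal Python sketch that consolidates equivalent URLs and drops query parameters that may carry private data. It is not the repository's code: the function name `sanitize_url`, the `SENSITIVE_PARAMS` list, and the specific normalization rules (lowercasing, dropping default ports, credentials, fragments, and trailing slashes) are all assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of query parameters treated as tracking or sensitive data.
SENSITIVE_PARAMS = {"utm_source", "utm_medium", "utm_campaign",
                    "fbclid", "email", "token"}

# Default ports that can be dropped without changing the target.
DEFAULT_PORTS = {"http": 80, "https": 443}


def sanitize_url(url: str) -> str:
    """Return a consolidated form of `url` (illustrative rules only)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()

    # `hostname` is already lowercased and excludes any user:password part,
    # so embedded credentials are dropped here.
    netloc = parts.hostname or ""
    if parts.port and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{netloc}:{parts.port}"

    # Drop sensitive parameters and sort the rest so that parameter order
    # does not produce distinct URLs.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SENSITIVE_PARAMS
    )

    # Assumption: a trailing slash does not change the target page.
    path = parts.path.rstrip("/") or "/"

    # The fragment is discarded because it is not sent to the server.
    return urlunsplit((scheme, netloc, path, urlencode(kept), ""))
```

Under these assumed rules, `HTTP://Example.com:80/news/?utm_source=fb&id=7#top` and `http://example.com/news?id=7&fbclid=abc` both reduce to `http://example.com/news?id=7`.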
Alternatives and similar repositories for URL-Sanitization
Users who are interested in URL-Sanitization are comparing it to the libraries listed below.
- COMM 4940: Governing Human-Algorithm Behavior ☆21 Updated last year
- ☆36 Updated 6 years ago
- Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018). ☆14 Updated 6 years ago
- Classify names by gender, U.S. ethnicity, or leaf nationality ☆19 Updated 6 years ago
- ☆73 Updated 7 months ago
- A multi-modal Twitter dataset with 7.6M tweets and 25.6M retweets related to voter fraud claims. ☆52 Updated 3 years ago
- Analyzing MTurk demographics ☆14 Updated last year
- Tool for probabilistically linking the records of individual entities (e.g. people) within and across datasets ☆117 Updated 8 months ago
- The documentation and scripts for the Local News Dataset ☆25 Updated 3 years ago
- Notebooks and other course materials for Emory QTM 340 (Fall 2021) ☆23 Updated 2 years ago
- Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison. ☆14 Updated 3 years ago
- 2020-election-night-model ☆59 Updated 4 years ago
- Fast, flexible name matching for large datasets ☆72 Updated 3 months ago
- smappdragon is a set of tools for working with twitter data. ☆29 Updated 6 years ago
- MPEDS Annotation Interface ☆18 Updated 2 years ago
- Machine-learning Protest Event Data System ☆38 Updated 9 months ago
- Materials to reproduce our findings in our stories, "Amazon Puts Its Own 'Brands' First Above Better-Rated Products" and "When Amazon Tak… ☆70 Updated 3 years ago
- Tracing policy ideas from think tanks and lobbyists through state legislative bills ☆47 Updated 8 years ago
- Closed Caption Transcripts of News Videos from archive.org 2014--2023 ☆49 Updated 4 months ago
- Code supporting the dissertation "Agents in Conflict," George Mason University, 2016 ☆21 Updated 9 years ago
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 Updated 5 years ago
- MoodCat😼 classifies the mood of English sentences. ☆14 Updated 3 years ago
- Tools for collecting social media data around focal events ☆84 Updated 3 years ago
- RECSM-UPF Summer School: Social Media and Big Data Research ☆22 Updated 8 years ago
- A Python module for clustering creators of social media content into networks ☆73 Updated 3 years ago
- Materials to reproduce findings in our stories, "Swinging the Vote?", and "To Gmail, Most Black Lives Matter Emails Are 'Promotions'" ☆38 Updated last year
- Datasets of the daily Twitter output of Congress. ☆112 Updated 2 years ago
- Text Thresher crowd sourced text annotator ☆17 Updated 7 years ago
- Twitter & Crowdtangle Data Access and Analysis Workshop for the Social Identity and Morality Lab ☆12 Updated 4 years ago
- Source code that reproduces the results from the paper "Who Let The Trolls Out? Towards Understanding State-Sponsored Trolls" (https://ar… ☆20 Updated 6 years ago