appeler / clean-names
Deduplicate and parse a list of 'dirty names'
☆23 Updated 4 years ago
Alternatives and similar repositories for clean-names
Users that are interested in clean-names are comparing it to the libraries listed below
- A maximum-strength name parser for record linkage. ☆37 Updated last month
- Inspect a URL and estimate if it contains a news story ☆39 Updated 7 months ago
- Investigative tool for extracting relevant areas from many documents ☆14 Updated 9 years ago
- America's most comprehensive dictionary of campaign finance jargon. A free resource created by and for data journalists. ☆17 Updated 2 weeks ago
- Data and scripts relating to the publishing of the House expenditure reports, and hopefully the Senate's in future. ☆24 Updated 4 years ago
- The core of sunlightlabs' Data Commons project. Includes the Transparency Data site and the APIs that power TransparencyData.com and Infl… ☆38 Updated 8 years ago
- ☆23 Updated 9 years ago
- Loads raw FEC filings into a database ☆22 Updated 2 years ago
- R Shiny App created to predict the success rate of Freedom of Information Act requests. ☆16 Updated 7 years ago
- Code for extracting data from a large number of PDFs, particularly FCC political ad documents ☆15 Updated 7 years ago
- https://www.washingtonpost.com/graphics/2020/investigations/helicopter-protests-washington-dc-national-guard/ ☆23 Updated 5 years ago
- Machine assisted dossiers ☆19 Updated 7 years ago
- Interactive and searchable House staffer directory, based on House disbursement data. ☆27 Updated last year
- Find rss, atom, xml, and rdf feeds on webpages ☆30 Updated 9 months ago
- DocumentCloud's back end source code - Please report bugs, issues and feature requests to info@documentcloud.org ☆40 Updated this week
- A financial disclosure data extraction tool. ☆16 Updated last year
- PageOneX. Analyzing front pages ☆52 Updated 7 months ago
- Demonstration project for building out a data news rig. ☆10 Updated 3 years ago
- Basic cookiecutter template for Python projects ☆21 Updated 9 months ago
- A simple command line interface to the datamade/dedupe library. ☆42 Updated 2 years ago
- Application for https://www.r-consortium.org/projects/call-for-proposals ☆13 Updated 6 years ago
- An open-source archive that gathers, saves, shares and analyzes news homepages ☆139 Updated 2 weeks ago
- How can we improve name matching in screening tools? ☆12 Updated 5 months ago
- Project generator for use with the datakit framework. ☆28 Updated last year
- Easily download U.S. census maps ☆33 Updated 2 years ago
- Archive of political ad data from the Federal Communications Commission ☆20 Updated 7 years ago
- Named-Entity Recognition extension for OpenRefine ☆29 Updated 2 years ago
- CNN Transcripts 2000--2025 ☆23 Updated 2 months ago
- Data on the 11,500+ athletes and 306 events at the Rio Olympics. Includes medals tallies ☆33 Updated 4 years ago
- Command line tool to convert spreadsheets to databases, made for the UK's Office for National Statistics. ☆80 Updated last year