jsoma / fuzzy_pandas
Fuzzy matches and merging of datasets in pandas using csvmatch
☆75 Updated 4 years ago
Alternatives and similar repositories for fuzzy_pandas:
Users who are interested in fuzzy_pandas are comparing it to the libraries listed below.
- Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more. ☆108 Updated 4 months ago
- A simple Python wrapper for U.S. Census Geocoding Services API batch service ☆42 Updated 4 months ago
- Tool for probabilistically linking the records of individual entities (e.g. people) within and across datasets ☆111 Updated 3 months ago
- Get Census Data from the API for arbitrary areas ☆45 Updated 6 months ago
- Dataset of state legislative elections from 1971–2018. ☆45 Updated 5 years ago
- Python wrapper for the US Census Geocoder ☆75 Updated 10 months ago
- Teaching guide for a one-hour hands-on session at an IRE/NICAR conference on using pandas to analyze data. ☆20 Updated 3 weeks ago
- A light-weight wrapper for the Datawrapper API. ☆63 Updated 8 months ago
- Materials for a NICAR 2020 workshop on advanced Census data with Python ☆17 Updated 2 years ago
- Text and statistics utilities from Pew Research Center ☆84 Updated 3 years ago
- A set of Jupyter notebooks demonstrating how to use the Media Cloud API. ☆37 Updated last year
- ☆73 Updated 11 months ago
- Public client for consuming content from the Media Cloud Online News Archive & Directory. ☆72 Updated 3 months ago
- A general list of resources and articles for people interested in getting into data journalism ☆16 Updated last year
- Fast, flexible name matching for large datasets ☆71 Updated last year
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4 ☆283 Updated 2 years ago
- Loads raw FEC filings into a database ☆22 Updated 2 years ago
- Workbook to teach the concept of risk ratios for data journalism applications ☆32 Updated 2 years ago
- This repository includes data for snap analyses of the 2018 Midterm Elections using unofficial election returns data. ☆49 Updated 6 years ago
- A step-by-step guide to publishing a standalone story from a dataset. ☆30 Updated 3 weeks ago
- Fraud detection related data and scripts to share with partners. ☆23 Updated 2 years ago
- Guess gender from first name in Python 2 and 3 ☆133 Updated 2 years ago
- The documentation and scripts for the Local News Dataset ☆25 Updated 2 years ago
- Incarceration Trends Dataset and Documentation ☆91 Updated 5 months ago
- Download IPEDS complete data files ☆39 Updated 7 years ago
- Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence ☆63 Updated last year
- A simple command line interface to the datamade/dedupe library. ☆42 Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆36 Updated last month
- The open-source web scrapers that feed the Los Angeles Times California coronavirus tracker. ☆58 Updated this week
- Command-line interface for downloading WARN Act notices of qualified plant closings and mass layoffs from state government websites ☆30 Updated 2 months ago