PovertyAction / PII_detection
Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets.
☆43 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for PII_detection
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ☆43 · Updated 5 years ago
- Interactive notebooks containing demonstration code of the splink library ☆37 · Updated 10 months ago
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom… ☆44 · Updated 4 months ago
- Library for identification, anonymization and de-anonymization of PII data ☆22 · Updated last year
- PySpark phonetic and string matching algorithms ☆35 · Updated 9 months ago
- Cohort extractor tool which can generate dummy data, or real data against OpenSAFELY-compliant research databases ☆38 · Updated 3 months ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆52 · Updated 3 weeks ago
- Content for a talk on "The wonderful world of data quality tools in Python" ☆18 · Updated 3 years ago
- How to do data science with Optimus, Spark and Python. ☆18 · Updated 5 years ago
- Wrapper around Google APIs to create charts in Google Slides with Python ☆32 · Updated 2 years ago
- A convenient way to anonymize your data for analytics ☆20 · Updated 3 years ago
- Instant search for and access to many datasets in PySpark. ☆34 · Updated 2 years ago
- A hands-on tutorial showing how to use Python to do anonymisation with synthetic data ☆78 · Updated 2 years ago
- Helper code to interact with Rasgo via our SDK, PyRasgo ☆40 · Updated last year
- MirrorDataGenerator is a Python tool that generates synthetic data based on user-specified causal relations among features in the data. I… ☆19 · Updated 2 years ago
- Provide an easy way with Python to protect your data sources by searching their metadata. 🛡️ ☆17 · Updated 3 weeks ago
- A Python package to create a database on the platform using our moj data warehousing framework ☆21 · Updated 2 months ago
- Server that simplifies connecting pandas to a realtime data feed, testing hypotheses and visualizing results in a web browser ☆33 · Updated last year
- Dashboard for Data Drift Detection in Python with Evidently and Mercury ☆14 · Updated 2 years ago
- Record matching and entity resolution at scale in Spark ☆31 · Updated last year
- Fully unit-tested utility functions for data engineering. Python 3 only. ☆14 · Updated 3 months ago
- Streamlit example showing scikit-learn and PySpark ML over healthcare data. It's simple! ☆30 · Updated 3 years ago
- Privacy-preserving synthetic data generation workflows ☆20 · Updated 2 years ago
- Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text. ☆75 · Updated last week
- Demo on how to use Prefect with Docker ☆26 · Updated 2 years ago