PovertyAction / PII_detection
Application and Python script to identify, remove, and/or recode personally identifiable information (PII) in field experiment datasets.
☆41 Updated 3 years ago
Related projects:
- A package to build an end-to-end pipeline for detecting personally identifiable information in text. ☆43 Updated 5 years ago
- Record matching and entity resolution at scale in Spark ☆31 Updated 10 months ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆51 Updated 3 weeks ago
- A Python package to create a database on the platform using our MoJ data warehousing framework ☆21 Updated 2 weeks ago
- Instant search for and access to many datasets in PySpark. ☆34 Updated last year
- Wrapper around Google APIs to create charts in Google Slides with Python ☆30 Updated 2 years ago
- Using a Jupyter notebook to develop an automated DevOps environment that starts and stops SageMaker notebook instances outside working hours ☆22 Updated 5 years ago
- Interactive notebooks containing demonstration code for the splink library ☆38 Updated 8 months ago
- How to do data science with Optimus, Spark and Python. ☆18 Updated 5 years ago
- A small Python module containing quick utility functions for standard ETL processes. ☆33 Updated this week
- Example of a Streamlit data app powered by Vaex ☆10 Updated 2 years ago
- Fast, flexible name matching for large datasets ☆69 Updated 9 months ago
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom… ☆43 Updated 2 months ago
- A repository for all sample plugins created with the Alteryx Python SDK ☆25 Updated 6 years ago
- Library for identification, anonymization and de-anonymization of PII data ☆22 Updated last year
- A hands-on tutorial showing how to use Python to do anonymisation with synthetic data ☆79 Updated 2 years ago
- Basic tutorial on using Apache Airflow ☆35 Updated 5 years ago
- MLOps simplified. One platform, all the functionality you need. Swiss made ☆94 Updated last week
- Server that simplifies connecting pandas to a realtime data feed, testing hypotheses and visualizing results in a web browser ☆32 Updated last year
- A Scalable Data Cleaning Library for PySpark. ☆26 Updated 5 years ago
- A scikit-learn compatible estimator based on business rules, with an interactive dashboard included ☆28 Updated 3 years ago
- Python package for text mining of time-series data ☆66 Updated 2 weeks ago
- A project to build a machine learning pipeline to detect personally identifiable information (PII) ☆16 Updated last year
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆53 Updated 2 years ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆34 Updated 4 years ago
- A convenient way to anonymize your data for analytics ☆20 Updated 2 years ago