thoughtworks-datakind / anonymizer
Library for identification, anonymization and de-anonymization of PII data
☆22 Updated last year
Related projects:
- How to do data science with Optimus, Spark and Python. ☆18 Updated 5 years ago
- ☆25 Updated 5 years ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ☆43 Updated 5 years ago
- Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆41 Updated 3 years ago
- Elasticsearch implementation of the MLflow tracking store ☆16 Updated 3 years ago
- Synthetic data generation for graph ML experiments ☆23 Updated 3 years ago
- Instant search for and access to many datasets in PySpark. ☆34 Updated last year
- Data Lineage Tracing Library ☆21 Updated 2 years ago
- Apache NiFi Custom Processor for working with Stanford CoreNLP for Sentiment Analysis in Java 8 ☆11 Updated 6 years ago
- Generating Realistic Synthetic Data ☆28 Updated 7 months ago
- Docker Compose and Google Colab demo to build a CDC with Delta Lake ☆15 Updated 2 years ago
- A few end-to-end examples that use data-describe ☆16 Updated last year
- Artificial Intelligence for Business Leaders ☆13 Updated last year
- Code examples for the Introduction to Kubeflow course ☆13 Updated 3 years ago
- This repository contains NiFi processors for interacting with Snowflake Cloud Data Platform. ☆12 Updated 11 months ago
- Spark NLP for Streamlit ☆15 Updated 3 years ago
- ☆19 Updated last month
- This project was created to promote and advocate the use of FOSS machine learning. ☆44 Updated 2 weeks ago
- Generalized project for running Airflow DAGs, with the possibility of skipping tasks already completed for a given set of input parameters. ☆14 Updated last year
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆23 Updated 2 years ago
- Provide an easy way with Python to protect your data sources by searching their metadata. 🛡️ ☆17 Updated this week
- Build a semantic search application with deep learning models. ☆13 Updated 11 months ago
- Record matching and entity resolution at scale in Spark ☆31 Updated 10 months ago
- A Scalable Data Cleaning Library for PySpark. ☆26 Updated 5 years ago
- A small Python module containing quick utility functions for standard ETL processes. ☆33 Updated this week
- ☆17 Updated last year
- MLOps simplified. One platform, all the functionality you need. Swiss made ☆94 Updated last week
- Model explanation provides the ability to interpret the effect of the predictors on the composition of an individual score. ☆12 Updated 3 years ago
- ICIJ #Fincen Files in Neo4j ☆32 Updated 3 years ago
- Streamlit example showing scikit-learn & PySpark ML over healthcare data! It's simple! ☆28 Updated 3 years ago