thoughtworks-datakind / anonymizer
Library for identification, anonymization and de-anonymization of PII data
☆22 Updated 2 years ago
Alternatives and similar repositories for anonymizer:
Users interested in anonymizer are comparing it to the libraries listed below.
- Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆45 Updated 3 years ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ☆44 Updated 5 years ago
- Data Lineage Tracing Library ☆22 Updated 3 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆57 Updated last week
- How to do data science with Optimus, Spark and Python. ☆19 Updated 5 years ago
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆57 Updated 3 years ago
- Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text. ☆81 Updated 2 weeks ago
- Automated Continuous Data Quality Measurement ☆12 Updated last year
- Apache NiFi Custom Processor for working with Stanford CoreNLP for Sentiment Analysis in Java 8 ☆11 Updated 6 years ago
- MirrorDataGenerator is a Python tool that generates synthetic data based on user-specified causal relations among features in the data. I… ☆23 Updated 2 years ago
- Instant search for and access to many datasets in PySpark. ☆34 Updated 2 years ago
- Datamallet is a Python library that contains several helper functions and modules for the common tasks in a typical data science workflow… ☆11 Updated 2 years ago
- Data extraction from documents with ML (research and experimental code repo) ☆16 Updated 2 years ago
- Streamlit example showing scikit-learn and PySpark ML over healthcare data. It's simple! ☆30 Updated 4 years ago
- Text classification AutoML ☆21 Updated 3 years ago
- Use Watson Natural Language Understanding and Watson Knowledge Studio to fingerprint personal data from unstructured documents ☆53 Updated 3 years ago
- Generating Realistic Synthetic Data ☆34 Updated last year
- A few end to end examples that use data-describe ☆16 Updated last year
- Hassle-free ML Pipelines on Kubernetes ☆38 Updated last year
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom… ☆44 Updated 9 months ago
- This repository contains code to build an MVP search engine with a Google-like interface. ☆15 Updated 4 years ago
- Apache NiFi NLP Processor ☆18 Updated last year
- Retrieval Augmented Generation applications ☆26 Updated last year
- Neural Solr = Solr 9 + Mighty Inference + Node ☆17 Updated 2 years ago
- Demonstration of how to perform continuous model monitoring on CML using Model Metrics and Evidently.ai dashboards ☆12 Updated 4 months ago
- Elasticsearch implementation of the MLflow tracking store ☆18 Updated 4 years ago
- Record matching and entity resolution at scale in Spark ☆34 Updated last year