gretelai / gretel-tools
General tools for machine learning, data engineering, and more!
☆14 · Updated last year
Related projects
Alternatives and complementary repositories for gretel-tools
- Public blueprints for data use cases ☆72 · Updated this week
- Synthetic data generators for structured and unstructured text, featuring differentially private learning. ☆596 · Updated this week
- Use FastCUT with public map images and location data from a few cities to generate realistic synthetic location data for any city in the … ☆22 · Updated 2 years ago
- PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such a… ☆276 · Updated last month
- A library of Reversible Data Transforms ☆121 · Updated this week
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆43 · Updated 3 years ago
- Metrics to evaluate quality and efficacy of synthetic datasets. ☆212 · Updated this week
- Data Privacy Toolkit ☆36 · Updated 2 months ago
- The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning wo… ☆168 · Updated last year
- A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data… ☆243 · Updated 6 months ago
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health. ☆88 · Updated 2 years ago
- ☆260 · Updated 7 months ago
- Anovos - An Open Source Library for Scalable feature engineering Using Apache-Spark ☆77 · Updated last year
- Generating Realistic Synthetic Data ☆31 · Updated 9 months ago
- Machine learning prediction in pure Python ☆86 · Updated 3 years ago
- Benchmarking synthetic data generation methods. ☆262 · Updated this week
- ☀️🦶 A lightweight framework for collaborative, open-source feature engineering ☆32 · Updated 3 years ago
- This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire… ☆177 · Updated this week
- Python language bindings for smartnoise-core. ☆75 · Updated last year
- Metaflow tutorials for ODSC West 2021 ☆65 · Updated 3 years ago
- Where Gretel published notebooks and code for blog posts ☆19 · Updated last year
- openclean - Data Cleaning and data profiling library for Python ☆69 · Updated 3 years ago
- 🧬 A JupyterLab extension for annotating data with Prodigy ☆188 · Updated last year
- Building a Deep Learning Powered Emoji Slackbot! ☆16 · Updated 4 years ago
- Tools and service for differentially private processing of tabular and relational data ☆254 · Updated 3 months ago
- Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbation… ☆165 · Updated 2 months ago
- Expose a Top2Vec model with a REST API. ☆88 · Updated last year
- A software package for privacy-preserving generation of a synthetic twin to a given sensitive data set. ☆48 · Updated 2 months ago
- Privacy preserving synthetic data generation workflows ☆20 · Updated 2 years ago