HoloClean / holoclean
A Machine Learning System for Data Enrichment.
☆531 · Updated 2 years ago
Alternatives and similar repositories for holoclean
Users who are interested in holoclean are comparing it to the libraries listed below.
- A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels f… ☆512 · Updated 9 months ago
- FlorDB 🌻 ☆157 · Updated 2 months ago
- An open source, high scalability toolkit in Java for Entity Resolution. ☆221 · Updated 5 months ago
- python automatic data quality check toolkit ☆281 · Updated 5 years ago
- ☆193 · Updated last year
- ☆79 · Updated 2 years ago
- ☆96 · Updated 5 years ago
- Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbation… ☆164 · Updated 5 months ago
- Python package for performing Entity and Text Matching using Deep Learning. ☆613 · Updated last year
- A list of free data matching and record linkage software. ☆397 · Updated last year
- detect demographic differences in the output of machine learning models or other assessments ☆320 · Updated 5 years ago
- Spark implementation of computing Shapley Values using monte-carlo approximation ☆79 · Updated 2 years ago
- A distributed Spark/Scala implementation of the isolation forest algorithm for unsupervised outlier detection, featuring support for scal… ☆250 · Updated 2 weeks ago
- More interactive weak supervision with FlyingSquid ☆316 · Updated 5 years ago
- What's in your data? Extract schema, statistics and entities from datasets ☆1,535 · Updated 3 months ago
- Python library for building highly effective data science workflows ☆947 · Updated 2 years ago
- DeltaPy - Tabular Data Augmentation (by @firmai) ☆556 · Updated 2 years ago
- Type System for Data Analysis in Python ☆215 · Updated 11 months ago
- The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning wo… ☆172 · Updated 2 weeks ago
- openclean - Data Cleaning and data profiling library for Python ☆83 · Updated 4 years ago
- Distributed Bayesian Entity Resolution in Apache Spark ☆58 · Updated 4 years ago
- A collection of tutorials for Snorkel ☆407 · Updated last year
- HandySpark - bringing pandas-like capabilities to Spark dataframes ☆197 · Updated 6 years ago
- Implementation of statistical models to analyze time lagged conversions ☆263 · Updated last year
- Auto Tune Models - A multi-tenant, multi-data system for automated machine learning (model selection and tuning). ☆527 · Updated 5 years ago
- The complete graph data science platform ☆141 · Updated 10 months ago
- Joblib Apache Spark Backend ☆249 · Updated 8 months ago
- 🐳 The stupidly simple CLI workspace for your data warehouse. ☆728 · Updated 2 years ago
- A model-agnostic visual debugging tool for machine learning ☆1,671 · Updated 10 months ago
- Library for Semi-Automated Data Science ☆345 · Updated 2 months ago