HoloClean / holoclean
A Machine Learning System for Data Enrichment.
☆532 Updated 2 years ago
Alternatives and similar repositories for holoclean
Users who are interested in holoclean are comparing it to the libraries listed below.
- A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels f… ☆510 Updated 9 months ago
- ☆193 Updated last year
- python automatic data quality check toolkit ☆278 Updated 5 years ago
- ☆96 Updated 5 years ago
- Python package for performing Entity and Text Matching using Deep Learning. ☆614 Updated last year
- A distributed Spark/Scala implementation of the isolation forest algorithm for unsupervised outlier detection, featuring support for scal… ☆250 Updated this week
- Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbation… ☆164 Updated 6 months ago
- An open source, high scalability toolkit in Java for Entity Resolution. ☆222 Updated 6 months ago
- ☆78 Updated 2 years ago
- FlorDB 🌻 ☆158 Updated 3 months ago
- Spark implementation of computing Shapley Values using monte-carlo approximation ☆80 Updated 2 years ago
- Type System for Data Analysis in Python ☆216 Updated 11 months ago
- A collection of tutorials for Snorkel ☆407 Updated last year
- openclean - Data Cleaning and data profiling library for Python ☆83 Updated 4 years ago
- The complete graph data science platform ☆142 Updated 11 months ago
- Implementation of statistical models to analyze time lagged conversions ☆263 Updated last year
- The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning wo… ☆173 Updated last month
- A Benchmark for Joint Data Cleaning and Machine Learning ☆50 Updated last year
- Joblib Apache Spark Backend ☆249 Updated 9 months ago
- Code and data for Sato https://arxiv.org/abs/1911.06311. ☆116 Updated last year
- Human-explainable AI. ☆529 Updated 4 months ago
- Random dataframe and database table generator ☆311 Updated 4 years ago
- What's in your data? Extract schema, statistics and entities from datasets ☆1,539 Updated 4 months ago
- Data Analysis Baseline Library ☆727 Updated last year
- Coarse-grained lineage and tracing for machine learning pipelines. ☆471 Updated 3 years ago
- TypeDB-ML is the Machine Learning integrations library for TypeDB ☆552 Updated 2 years ago
- DeltaPy - Tabular Data Augmentation (by @firmai) ☆556 Updated 2 years ago
- A tool for compiling trained SKLearn models into other representations (such as SQL, Sympy or Excel formulas) ☆176 Updated 3 years ago
- Monitor the stability of a Pandas or Spark dataframe ⚙︎ ☆510 Updated 2 weeks ago
- 🐳 The stupidly simple CLI workspace for your data warehouse. ☆728 Updated 2 years ago