HoloClean / holoclean
A Machine Learning System for Data Enrichment.
☆520Updated last year
Alternatives and similar repositories for holoclean
Users who are interested in holoclean are comparing it to the libraries listed below.
- A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels f…☆507Updated 3 months ago
- A distributed Spark/Scala implementation of the isolation forest algorithm for unsupervised outlier detection, featuring support for scal…☆247Updated last week
- Python automatic data quality check toolkit☆283Updated 4 years ago
- ☆96Updated 5 years ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark☆1,514Updated 7 months ago
- Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbation…☆165Updated last week
- ☆191Updated last year
- 🐳 The stupidly simple CLI workspace for your data warehouse.☆728Updated 2 years ago
- DeltaPy - Tabular Data Augmentation (by @firmai)☆548Updated last year
- A list of free data matching and record linkage software.☆387Updated last year
- A collection of tutorials for Snorkel☆398Updated 7 months ago
- Python package for performing Entity and Text Matching using Deep Learning.☆594Updated last year
- An open source, high scalability toolkit in Java for Entity Resolution.☆218Updated last year
- More interactive weak supervision with FlyingSquid☆315Updated 4 years ago
- Auto Tune Models - A multi-tenant, multi-data system for automated machine learning (model selection and tuning).☆530Updated 5 years ago
- A model-agnostic visual debugging tool for machine learning☆1,667Updated 5 months ago
- Joblib Apache Spark Backend☆249Updated 3 months ago
- A collection of demos showcasing automated feature engineering and machine learning in diverse use cases☆500Updated last year
- Python library for building highly effective data science workflows☆949Updated last year
- Monitor the stability of a Pandas or Spark dataframe ⚙︎☆503Updated 5 months ago
- Library for Semi-Automated Data Science☆339Updated 2 months ago
- detect demographic differences in the output of machine learning models or other assessments☆317Updated 4 years ago
- Type System for Data Analysis in Python☆213Updated 5 months ago
- Train and run PyTorch models on Apache Spark.☆339Updated 2 years ago
- Spark implementation of computing Shapley values using Monte Carlo approximation☆74Updated 2 years ago
- MLeap: Deploy ML Pipelines to Production☆1,515Updated 7 months ago
- Source code/webpage/demos for the What-If Tool☆962Updated 10 months ago
- Interpret Community extends Interpret repository with additional interpretability techniques and utility functions to handle real-world d…☆433Updated 5 months ago
- Human-explainable AI.☆527Updated last week
- Data Analysis Baseline Library☆728Updated 6 months ago