HoloClean / holoclean
A Machine Learning System for Data Enrichment.
☆520 Updated last year
Alternatives and similar repositories for holoclean
Users who are interested in holoclean are comparing it to the libraries listed below.
- More interactive weak supervision with FlyingSquid ☆315 Updated 4 years ago
- A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels f… ☆507 Updated 2 months ago
- ☆190 Updated last year
- Type System for Data Analysis in Python ☆212 Updated 4 months ago
- python automatic data quality check toolkit ☆283 Updated 4 years ago
- Python package for performing Entity and Text Matching using Deep Learning. ☆593 Updated last year
- Coarse-grained lineage and tracing for machine learning pipelines. ☆469 Updated 2 years ago
- Flow with FlorDB 🌻 ☆154 Updated 3 weeks ago
- Tool to automate data quality checks on data pipelines ☆255 Updated 2 years ago
- A collection of tutorials for Snorkel ☆397 Updated 7 months ago
- Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbation… ☆165 Updated 4 months ago
- A distributed Spark/Scala implementation of the isolation forest algorithm for unsupervised outlier detection, featuring support for scal… ☆248 Updated 2 months ago
- Interpret Community extends Interpret repository with additional interpretability techniques and utility functions to handle real-world d… ☆431 Updated 4 months ago
- Random dataframe and database table generator ☆309 Updated 4 years ago
- A model-agnostic visual debugging tool for machine learning ☆1,666 Updated 4 months ago
- 🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ☆141 Updated last year
- openclean - Data Cleaning and data profiling library for Python ☆79 Updated 3 years ago
- The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning wo… ☆171 Updated 2 years ago
- Generate and Visualize Data Lineage from query history ☆326 Updated last year
- python library for automated dataset normalization ☆115 Updated last year
- Source code for several Metanome data profiling algorithms ☆55 Updated 2 years ago
- ☆11 Updated 4 years ago
- The complete graph data science platform ☆139 Updated 4 months ago
- data⎰describe: Pythonic EDA Accelerator for Data Science ☆301 Updated 2 years ago
- ☆77 Updated 2 years ago
- This repository contains source code for the TaBERT model, a pre-trained language model for learning joint representations of natural lan… ☆600 Updated 3 years ago
- Hopsworks - Data-Intensive AI platform with a Feature Store ☆1,229 Updated 4 months ago
- Luminaire is a python package that provides ML driven solutions for monitoring time series data. ☆781 Updated last year
- Monitor the stability of a Pandas or Spark dataframe ⚙︎ ☆503 Updated 4 months ago
- Joblib Apache Spark Backend ☆248 Updated 2 months ago