LaureBerti / Learn2Clean
Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning
☆51Updated 2 years ago
Alternatives and similar repositories for Learn2Clean
Users who are interested in Learn2Clean are comparing it to the libraries listed below.
- Record matching and entity resolution at scale in Spark☆34Updated last year
- Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptio…☆40Updated 2 years ago
- Unified slicing for all Python data structures.☆35Updated 4 months ago
- ☆29Updated 3 years ago
- An easier approach to using and understanding ML models☆23Updated last month
- A collection of data sets for stream learning.☆34Updated 5 years ago
- Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method☆13Updated last year
- ☆22Updated last year
- Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations"☆17Updated 3 years ago
- An automated machine learning tool aimed to facilitate AutoML research.☆99Updated 9 months ago
- Editing machine learning models to reflect human knowledge and values☆126Updated last year
- MinHash implementation in Python☆11Updated 10 months ago
- MirrorDataGenerator is a Python tool that generates synthetic data based on user-specified causal relations among features in the data. I…☆23Updated 3 years ago
- Pipeline components that support partial_fit.☆46Updated 11 months ago
- ACV is a Python library that provides explanations for any machine learning model or data. It gives local rule-based explanations for any…☆102Updated 2 years ago
- Python Interface of the Scalable Bayesian Rule Lists☆20Updated 5 years ago
- scikit-mine: pattern mining in Python☆73Updated 2 years ago
- Python package for deduplication/entity resolution using active learning☆80Updated 10 months ago
- Missing data amputation and exploration functions for Python☆71Updated 2 years ago
- Inspect ML Pipelines in Python in the form of a DAG☆70Updated last year
- Exploring some issues related to churn☆16Updated last year
- openclean - Data Cleaning and data profiling library for Python☆79Updated 3 years ago
- A Tree Search Library for Data Cleaning☆22Updated 3 years ago
- GAM (Global Attribution Mapping) explains the landscape of neural network predictions across subpopulations☆34Updated 2 months ago
- FlexMatcher is a schema matching package in Python which handles the problem of matching multiple schemas to a single mediated schema.☆29Updated 6 months ago
- this repo might get accepted☆28Updated 4 years ago
- A library for feature selection for gradient boosting models using regression on feature Shapley values☆32Updated 7 months ago
- A Benchmark for Joint Data Cleaning and Machine Learning☆48Updated last year
- Automatic Feature Engineering for Time Series☆17Updated 2 years ago
- Optimization-Based Rule Learning for Classification☆47Updated last week