LaureBerti / Learn2Clean
Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning
☆51 Updated 2 years ago
Alternatives and similar repositories for Learn2Clean:
Users who are interested in Learn2Clean are comparing it to the libraries listed below.
- Record matching and entity resolution at scale in Spark ☆34 Updated last year
- Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptio… ☆39 Updated last year
- Editing machine learning models to reflect human knowledge and values ☆124 Updated last year
- Python library to explain Tree Ensemble models (TE) like XGBoost, using a rule list. ☆53 Updated 11 months ago
- openclean - Data Cleaning and data profiling library for Python ☆75 Updated 3 years ago
- An abstraction layer for parameter tuning ☆35 Updated 7 months ago
- Unified slicing for all Python data structures. ☆35 Updated 2 months ago
- Explore and compare 1K+ accurate decision trees in your browser! ☆160 Updated last year
- CinnaMon is a Python library which offers a number of tools to detect, explain, and correct data drift in a machine learning system ☆76 Updated 2 years ago
- Helpers for scikit learn ☆16 Updated 2 years ago
- An automated machine learning tool aimed to facilitate AutoML research. ☆97 Updated 7 months ago
- How to use SHAP values for better cluster analysis ☆57 Updated 2 years ago
- Exploring some issues related to churn ☆16 Updated last year
- GAM (Global Attribution Mapping) explains the landscape of neural network predictions across subpopulations ☆33 Updated 3 months ago
- MirrorDataGenerator is a python tool that generates synthetic data based on user-specified causal relations among features in the data. I… ☆22 Updated 2 years ago
- Python package for deduplication/entity resolution using active learning ☆78 Updated 7 months ago
- An automation tool to refactor Jupyter Notebooks to Python modules, with code dependency analysis. ☆12 Updated last month
- Pipeline Profiler is a tool for visualizing machine learning pipelines generated by AutoML tools. ☆84 Updated last year
- Abstractions for feature engineering on large graphs of tabular data. ☆21 Updated last week
- Repository for my master thesis on automated string handling ☆16 Updated 3 years ago
- scikit-mine : pattern mining in Python ☆73 Updated last year
- ☆20 Updated last year
- A more flexible alternative to scikit-learn Pipelines ☆33 Updated 10 months ago
- ⚓ Eurybia monitors model drift over time and secures model deployment with data validation ☆207 Updated 5 months ago
- ACV is a python library that provides explanations for any machine learning model or data. It gives local rule-based explanations for any… ☆100 Updated 2 years ago
- The official implementation of "The Shapley Value of Classifiers in Ensemble Games" (CIKM 2021). ☆219 Updated last year
- Instant search for and access to many datasets in Pyspark. ☆34 Updated 2 years ago
- ☆32 Updated 3 years ago
- this repo might get accepted ☆28 Updated 4 years ago
- Exploratory repository to study predictive survival analysis models ☆34 Updated last year