chu-data-lab / CPClean
Data Cleaning for ML under the Certain Prediction Framework
☆11 Updated 3 years ago
Alternatives and similar repositories for CPClean
Users interested in CPClean compare it to the libraries listed below.
- Inspect ML Pipelines in Python in the form of a DAG ☆70 Updated last year
- Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptio… ☆39 Updated last year
- ☆22 Updated last year
- A Benchmark for Joint Data Cleaning and Machine Learning ☆48 Updated 11 months ago
- Code repository for our paper "Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift": https://arxiv.org/abs/1810.119… ☆104 Updated last year
- Model Agnostic Counterfactual Explanations ☆87 Updated 2 years ago
- Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning (AISTATS 2022 Oral) ☆41 Updated 2 years ago
- Measuring data importance over ML pipelines using the Shapley value. ☆42 Updated this week
- Public home of pycorels, the Python binding to CORELS ☆80 Updated 4 years ago
- Repository for "Online Active Model Selection for Pre-trained ML Classifiers" ☆15 Updated 2 years ago
- ☆32 Updated 3 years ago
- 💱 A curated list of data valuation (DV) to design your next data marketplace ☆118 Updated 2 months ago
- Editing machine learning models to reflect human knowledge and values ☆124 Updated last year
- (ICML 2021) Mandoline: Model Evaluation under Distribution Shift ☆31 Updated 3 years ago
- Hyperparameter tuning via uncertainty modeling ☆47 Updated last year
- Distributional Shapley: A Distributional Framework for Data Valuation ☆30 Updated last year
- A Natural Language Interface to Explainable Boosting Machines ☆66 Updated 10 months ago
- A practical Active Learning Python package with a strong focus on experiments. ☆51 Updated 2 years ago
- Automatic data slicing ☆34 Updated 3 years ago
- An automated machine learning tool aimed to facilitate AutoML research. ☆99 Updated 8 months ago
- A software package for privacy-preserving generation of a synthetic twin to a given sensitive data set. ☆52 Updated 8 months ago
- ☆17 Updated 4 years ago
- ✂️ Fast slice finding for Machine Learning model debugging. ☆91 Updated 2 weeks ago
- Python Meta-Feature Extractor package. ☆133 Updated 10 months ago
- Pipeline Profiler is a tool for visualizing machine learning pipelines generated by AutoML tools. ☆84 Updated last year
- Using / reproducing DAC from the paper "Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees" ☆28 Updated 4 years ago
- An implementation of the IDS (Interpretable Decision Sets) algorithm. ☆24 Updated 4 years ago
- ☆12 Updated last month
- A benchmark for distribution shift in tabular data ☆52 Updated 11 months ago
- An Empirical Framework for Domain Generalization In Clinical Settings ☆30 Updated 3 years ago