easeml / datascope
Measuring data importance over ML pipelines using the Shapley value.
☆38 · Updated 2 months ago
Alternatives and similar repositories for datascope:
Users interested in datascope are comparing it to the libraries listed below.
- (ICML 2021) Mandoline: Model Evaluation under Distribution Shift ☆31 · Updated 3 years ago
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible. ☆42 · Updated last month
- Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning (AISTATS 2022 Oral) ☆40 · Updated 2 years ago
- A benchmark of data-centric tasks from across the machine learning lifecycle. ☆72 · Updated 2 years ago
- Distributional Shapley: A Distributional Framework for Data Valuation ☆30 · Updated 11 months ago
- ☆35 · Updated last year
- Advances in Neural Information Processing Systems (NeurIPS 2021) ☆22 · Updated 2 years ago
- Testing Language Models for Memorization of Tabular Datasets. ☆33 · Updated 2 months ago
- Automatic data slicing ☆34 · Updated 3 years ago
- Official Repo for "Efficient task-specific data valuation for nearest neighbor algorithms" ☆26 · Updated 5 years ago
- ☆89 · Updated last year
- Active and Sample-Efficient Model Evaluation ☆24 · Updated 4 years ago
- ☆17 · Updated 4 years ago
- ☆37 · Updated 3 years ago
- ☆28 · Updated last year
- OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023) ☆95 · Updated 2 months ago
- Implementation of Influence Function approximations for differently sized ML models, using PyTorch ☆15 · Updated last year
- AutoML Two-Sample Test ☆19 · Updated 2 years ago
- Code repository for our paper "Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift": https://arxiv.org/abs/1810.119… ☆104 · Updated last year
- ModelDiff: A Framework for Comparing Learning Algorithms ☆56 · Updated last year
- PyTorch reimplementation of computing Shapley values via Truncated Monte Carlo sampling from "What is your data worth? Equitable Valuatio… ☆27 · Updated 3 years ago
- Using / reproducing DAC from the paper "Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees" ☆27 · Updated 4 years ago
- Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling ☆31 · Updated 4 years ago
- Influence Estimation for Gradient-Boosted Decision Trees ☆27 · Updated 10 months ago
- Efficient Computation and Analysis of Distributional Shapley Values (AISTATS 2021) ☆21 · Updated last year
- 😇 A curated list of links and resources for Fair ML and Data Ethics ☆18 · Updated 2 years ago
- Logic Explained Networks is a Python repository implementing explainable-by-design deep learning models. ☆49 · Updated last year
- Data for "Datamodels: Predicting Predictions with Training Data" ☆96 · Updated last year
- ☆32 · Updated 3 years ago
- Research on Tabular Foundation Models ☆45 · Updated 4 months ago