easeml / datascope
Measuring data importance over ML pipelines using the Shapley value.
☆38 · Updated last month
Alternatives and similar repositories for datascope:
Users who are interested in datascope are comparing it to the libraries listed below:
- (ICML 2021) Mandoline: Model Evaluation under Distribution Shift ☆31 · Updated 3 years ago
- Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning (AISTATS 2022 Oral) ☆40 · Updated 2 years ago
- Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling ☆31 · Updated 4 years ago
- ModelDiff: A Framework for Comparing Learning Algorithms ☆56 · Updated last year
- A lightweight implementation of removal-based explanations for ML models. ☆59 · Updated 3 years ago
- Distributional Shapley: A Distributional Framework for Data Valuation ☆30 · Updated 10 months ago
- A benchmark of data-centric tasks from across the machine learning lifecycle. ☆72 · Updated 2 years ago
- Using / reproducing DAC from the paper "Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees" ☆27 · Updated 4 years ago
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible. ☆41 · Updated 2 weeks ago
- Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value (ICML 2023) ☆18 · Updated last year
- Research on Tabular Foundation Models ☆43 · Updated 3 months ago
- Testing Language Models for Memorization of Tabular Datasets. ☆33 · Updated last month
- PyTorch reimplementation of computing Shapley values via Truncated Monte Carlo sampling from "What is your data worth? Equitable Valuatio… ☆27 · Updated 3 years ago
- Advances in Neural Information Processing Systems (NeurIPS 2021) ☆22 · Updated 2 years ago
- Influence Estimation for Gradient-Boosted Decision Trees ☆26 · Updated 9 months ago
- TabDPT: Scaling Tabular Foundation Models ☆26 · Updated last week
- Active and Sample-Efficient Model Evaluation ☆24 · Updated 4 years ago
- OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023) ☆96 · Updated last month
- Data for "Datamodels: Predicting Predictions with Training Data" ☆95 · Updated last year
- Official Repo for "Efficient task-specific data valuation for nearest neighbor algorithms" ☆26 · Updated 5 years ago
- [NeurIPS 2020] Coresets for Robust Training of Neural Networks against Noisy Labels ☆32 · Updated 3 years ago
- automatic data slicing ☆35 · Updated 3 years ago
- AutoML Two-Sample Test ☆19 · Updated 2 years ago
- Code repository for the AISTATS 2021 paper "Towards Understanding the Optimal Behaviors of Deep Active Learning Algorithms" ☆15 · Updated 4 years ago
- Conformal prediction for controlling monotonic risk functions. Simple accompanying PyTorch code for conformal risk control in computer vi… ☆62 · Updated 2 years ago