schelterlabs / jenga
Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptions (e.g., missing values, broken character encodings) on the prediction quality of their ML models.
☆40 Updated 2 years ago
Alternatives and similar repositories for jenga
Users who are interested in jenga are comparing it to the libraries listed below.
- Inspect ML Pipelines in Python in the form of a DAG ☆70 Updated last year
- Code repository for our paper "Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift": https://arxiv.org/abs/1810.119… ☆105 Updated last year
- A Benchmark for Joint Data Cleaning and Machine Learning ☆48 Updated last year
- automatic data slicing ☆34 Updated 3 years ago
- Data-Centric What-If Analysis for Native Machine Learning Pipelines ☆16 Updated 2 years ago
- ☆22 Updated last year
- SPEAR: Programmatically label and build training data quickly. ☆107 Updated 11 months ago
- Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning ☆51 Updated 2 years ago
- Data Cleaning for ML under the Certain Prediction Framework ☆11 Updated 3 years ago
- Foundation Models for Data Tasks ☆106 Updated 2 years ago
- Editing machine learning models to reflect human knowledge and values ☆126 Updated last year
- openclean - Data Cleaning and data profiling library for Python ☆79 Updated 3 years ago
- Measuring data importance over ML pipelines using the Shapley value. ☆42 Updated last month
- Spark implementation of computing Shapley Values using monte-carlo approximation ☆74 Updated 2 years ago
- Model Agnostic Counterfactual Explanations ☆87 Updated 2 years ago
- A library of Reversible Data Transforms ☆127 Updated this week
- The official implementation of "The Shapley Value of Classifiers in Ensemble Games" (CIKM 2021). ☆220 Updated last year
- Train Gradient Boosting models that are both high-performance *and* Fair! ☆105 Updated last year
- A Tree Search Library for Data Cleaning ☆22 Updated 3 years ago
- A benchmark of data-centric tasks from across the machine learning lifecycle. ☆72 Updated 3 years ago
- Python Interface of the Scalable Bayesian Rule Lists ☆20 Updated 5 years ago
- A practical Active Learning python package with a strong focus on experiments. ☆51 Updated 2 years ago
- A Natural Language Interface to Explainable Boosting Machines ☆67 Updated 11 months ago
- ☆29 Updated 3 years ago
- Code to reproduce the results in the paper Supervised Learning on Relational Databases with Graph Neural Networks. ☆63 Updated 5 years ago
- Explaining Inference Queries with Bayesian Optimization ☆10 Updated 4 years ago
- Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning (AISTATS 2022 Oral) ☆41 Updated 2 years ago
- Weakly Supervised End-to-End Learning (NeurIPS 2021) ☆157 Updated 2 years ago
- Extremely simple and fast extreme multi-class and multi-label classifiers. ☆69 Updated 2 months ago
- A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching … ☆88 Updated 3 weeks ago