schelterlabs / jenga
Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptions (e.g., missing values, broken character encodings) on the prediction quality of their ML models.
☆41 · Updated 2 years ago
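The kind of experiment described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not use jenga's API, the data is synthetic, the "model" is a simple nearest-centroid classifier, and every name (`make_data`, `fit_centroids`, `corrupt_missing`, and so on) is hypothetical. Missing values are mimicked by overwriting feature values with a constant fill value.

```python
import random


def make_data(n, rng):
    """Synthetic data: two Gaussian blobs in 2-D, class 0 near (0, 0), class 1 near (3, 3)."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 3.0 * label
        rows.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    return rows


def fit_centroids(rows):
    """Stand-in model: the per-class mean of each feature."""
    centroids = {}
    for label in (0, 1):
        points = [x for x, y in rows if y == label]
        centroids[label] = [sum(col) / len(points) for col in zip(*points)]
    return centroids


def predict(centroids, x):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


def corrupt_missing(rows, fraction, rng, fill=0.0):
    """Overwrite each feature value with `fill` with probability `fraction`.

    This mimics missing values that were imputed with a constant.
    """
    return [
        ([fill if rng.random() < fraction else v for v in x], y)
        for x, y in rows
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data(500, rng)
test = make_data(500, rng)

model = fit_centroids(train)  # train on clean data only
clean_acc = accuracy(model, test)
corrupted_acc = accuracy(model, corrupt_missing(test, 0.5, rng))

print(f"accuracy on clean test data:     {clean_acc:.3f}")
print(f"accuracy on corrupted test data: {corrupted_acc:.3f}")
```

The model is trained once on clean data and evaluated twice, so any gap between the two printed accuracies is attributable to the corruption rather than to retraining.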
Alternatives and similar repositories for jenga
Users who are interested in jenga are comparing it to the libraries listed below.
- Inspect ML Pipelines in Python in the form of a DAG ☆70 · Updated last year
- A Benchmark for Joint Data Cleaning and Machine Learning ☆50 · Updated last year
- Code repository for our paper "Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift": https://arxiv.org/abs/1810.119… ☆107 · Updated last year
- The official implementation of "The Shapley Value of Classifiers in Ensemble Games" (CIKM 2021). ☆223 · Updated 2 weeks ago
- A Tree Search Library for Data Cleaning ☆22 · Updated 3 years ago
- automatic data slicing ☆35 · Updated 4 years ago
- Weakly Supervised End-to-End Learning (NeurIPS 2021) ☆156 · Updated 2 years ago
- Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning ☆53 · Updated 3 years ago
- Spark implementation of computing Shapley Values using monte-carlo approximation ☆79 · Updated 2 years ago
- openclean - Data Cleaning and data profiling library for Python ☆83 · Updated 4 years ago
- Metrics to evaluate quality and efficacy of synthetic datasets. ☆256 · Updated this week
- Clustering for mixed-type data ☆101 · Updated last year
- Benchmarking synthetic data generation methods. ☆301 · Updated this week
- A library of Reversible Data Transforms ☆131 · Updated this week
- SPEAR: Programmatically label and build training data quickly. ☆109 · Updated last year
- ✂️ Fast slice finding for Machine Learning model debugging. ☆97 · Updated 2 weeks ago
- ☆33 · Updated 4 years ago
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. ☆41 · Updated 2 years ago
- ☆32 · Updated 4 years ago
- Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbation… ☆164 · Updated 6 months ago
- Distribution transparent Machine Learning experiments on Apache Spark ☆91 · Updated last year
- FlorDB 🌻 ☆158 · Updated 3 months ago
- Editing machine learning models to reflect human knowledge and values ☆128 · Updated 2 years ago
- ☆22 · Updated 2 years ago
- Train Gradient Boosting models that are both high-performance *and* Fair! ☆106 · Updated 3 weeks ago
- this repo might get accepted ☆28 · Updated 4 years ago
- Public home of pycorels, the python binding to CORELS ☆80 · Updated 5 years ago
- Code and data for Sato https://arxiv.org/abs/1911.06311. ☆116 · Updated last year
- Implementation of Rank-biased Overlap ☆153 · Updated last year
- The stream-learn is an open-source Python library for difficult data stream analysis. ☆66 · Updated 4 months ago