schelterlabs / jenga
Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptions (e.g., missing values, broken character encodings) on the prediction quality of their ML models.
☆41 Updated 2 years ago
Alternatives and similar repositories for jenga
Users who are interested in jenga compare it to the libraries listed below.
- Inspect ML Pipelines in Python in the form of a DAG ☆70 Updated last year
- Code repository for our paper "Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift": https://arxiv.org/abs/1810.119… ☆107 Updated last year
- The official implementation of "The Shapley Value of Classifiers in Ensemble Games" (CIKM 2021). ☆220 Updated 2 years ago
- A Benchmark for Joint Data Cleaning and Machine Learning ☆49 Updated last year
- Weakly Supervised End-to-End Learning (NeurIPS 2021) ☆156 Updated 2 years ago
- automatic data slicing ☆34 Updated 4 years ago
- ☆104 Updated last year
- A Tree Search Library for Data Cleaning ☆22 Updated 3 years ago
- ☆33 Updated 4 years ago
- this repo might get accepted ☆28 Updated 4 years ago
- Benchmarking synthetic data generation methods. ☆279 Updated this week
- openclean - Data Cleaning and data profiling library for Python ☆81 Updated 3 years ago
- Extra functionalities for river ☆14 Updated last year
- Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning ☆51 Updated 2 years ago
- Measuring data importance over ML pipelines using the Shapley value. ☆43 Updated last month
- Spark implementation of computing Shapley Values using monte-carlo approximation ☆76 Updated 2 years ago
- Distribution transparent Machine Learning experiments on Apache Spark ☆91 Updated last year
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. ☆42 Updated 2 years ago
- Model Agnostic Counterfactual Explanations ☆88 Updated 3 years ago
- The stream-learn is an open-source Python library for difficult data stream analysis. ☆64 Updated last month
- A library of Reversible Data Transforms ☆128 Updated last week
- Data-Centric What-If Analysis for Native Machine Learning Pipelines ☆16 Updated 2 years ago
- ☆22 Updated 2 years ago
- Metrics to evaluate quality and efficacy of synthetic datasets. ☆247 Updated 2 weeks ago
- Code for extracting, parsing and annotating tables from GitTables (https://gittables.github.io). ☆45 Updated 3 years ago
- ☆84 Updated 3 months ago
- Python Interface of the Scalable Bayesian Rule Lists ☆20 Updated 5 years ago
- The Tornado framework, designed and implemented for adaptive online learning and data stream mining in Python. ☆130 Updated last year
- SPEAR: Programmatically label and build training data quickly. ☆108 Updated last year
- A distributed Spark/Scala implementation of the isolation forest algorithm for unsupervised outlier detection, featuring support for scal… ☆249 Updated last month