data-centric-ai / dcbench
A benchmark of data-centric tasks from across the machine learning lifecycle.
☆72 Updated 2 years ago
Related projects
Alternatives and complementary repositories for dcbench
- ☆134 Updated last year
- Code for Active Learning at The ImageNet Scale. This repository implements many popular active learning algorithms and allows training wi… ☆52 Updated 2 years ago
- 🛠️ Corrected Test Sets for ImageNet, MNIST, CIFAR, Caltech-256, QuickDraw, IMDB, Amazon Reviews, 20News, and AudioSet ☆182 Updated last year
- Weakly Supervised End-to-End Learning (NeurIPS 2021) ☆153 Updated last year
- Code for the CVPR 2021 paper: Understanding Failures of Deep Networks via Robust Feature Extraction ☆35 Updated 2 years ago
- Combating hidden stratification with GEORGE ☆62 Updated 3 years ago
- Implementation of Estimating Training Data Influence by Tracing Gradient Descent (NeurIPS 2020) ☆219 Updated 2 years ago
- Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling ☆30 Updated 3 years ago
- ☆86 Updated last year
- Code repository for our paper "Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift": https://arxiv.org/abs/1810.119… ☆102 Updated 7 months ago
- ☆103 Updated last year
- Model Patching: Closing the Subgroup Performance Gap with Data Augmentation ☆42 Updated 4 years ago
- [NeurIPS 2021] WRENCH: Weak supeRvision bENCHmark ☆220 Updated 9 months ago
- Data for "Datamodels: Predicting Predictions with Training Data" ☆90 Updated last year
- A fast, effective data attribution method for neural networks in PyTorch ☆179 Updated this week
- Active and Sample-Efficient Model Evaluation ☆24 Updated 3 years ago
- This repository contains the code of the distribution shift framework presented in A Fine-Grained Analysis on Distribution Shift (Wiles e… ☆80 Updated 3 weeks ago
- MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022) ☆108 Updated 2 years ago
- Code for "Supermasks in Superposition" ☆117 Updated last year
- DISTIL: Deep dIverSified inTeractIve Learning. An active/interactive learning library built on PyTorch for reducing labeling costs. ☆142 Updated last year
- This repository holds code and other relevant files for the NeurIPS 2022 tutorial: Foundational Robustness of Foundation Models. ☆70 Updated last year
- ☆96 Updated 2 years ago
- Reusable BatchBALD implementation ☆74 Updated 8 months ago
- automatic data slicing ☆35 Updated 3 years ago
- Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics ☆192 Updated 2 years ago
- Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using co… ☆323 Updated last year
- ☆61 Updated 3 years ago
- Measuring data importance over ML pipelines using the Shapley value. ☆36 Updated 3 weeks ago
- ☆22 Updated last year
- Original dataset release for CIFAR-10H ☆82 Updated 4 years ago