daviddao / awesome-data-valuation
A curated list of data valuation (DV) to design your next data marketplace
★137 · Feb 20, 2025 · Updated 11 months ago
Alternatives and similar repositories for awesome-data-valuation
Users who are interested in awesome-data-valuation are comparing it to the libraries listed below
- This is an official repository for "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR2023). ★52 · Jun 5, 2024 · Updated last year
- Scalable data valuation using optimal transport (ICLR 2025) ★13 · Jul 15, 2025 · Updated 6 months ago
- Distributional Shapley: A Distributional Framework for Data Valuation ★30 · May 1, 2024 · Updated last year
- This is an official repository for "Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources" (… ★14 · Oct 26, 2023 · Updated 2 years ago
- Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value (ICML 2023) ★21 · Jul 26, 2023 · Updated 2 years ago
- Data Shapley: Equitable Valuation of Data for Machine Learning ★288 · May 1, 2024 · Updated last year
- A Python Data Valuation Package ★33 · Feb 3, 2023 · Updated 3 years ago
- Data Banzhaf: A Robust Data Valuation Framework for Machine Learning (AISTATS 2023 Oral) ★18 · Oct 15, 2023 · Updated 2 years ago
- PyTorch reimplementation of computing Shapley values via Truncated Monte Carlo sampling from "What is your data worth? Equitable Valuatio… ★27 · Jan 21, 2022 · Updated 4 years ago
- ★17 · Mar 23, 2025 · Updated 10 months ago
- An interpretable network to compute the Shapley values in a single forward propagation. ★16 · May 24, 2023 · Updated 2 years ago
- [ICML'21] Estimate the accuracy of the classifier in various environments through self-supervision ★27 · Sep 2, 2021 · Updated 4 years ago
- A fast, effective data attribution method for neural networks in PyTorch ★229 · Nov 18, 2024 · Updated last year
- [ICML'2022] Estimating Instance-dependent Bayes-label Transition Matrix using a Deep Neural Network ★21 · Jul 19, 2022 · Updated 3 years ago
- A library for language transfer methods and algorithms. ★16 · Feb 6, 2026 · Updated last week
- Implementation of the spotlight: a method for discovering systematic errors in deep learning models ★11 · Oct 5, 2021 · Updated 4 years ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024] ★79 · Nov 14, 2024 · Updated last year
- Awesome coreset/core-set/subset/sample selection works. ★182 · Jun 30, 2024 · Updated last year
- Code for the paper "Pretrained Models for Multilingual Federated Learning" at NAACL 2022 ★11 · Aug 9, 2022 · Updated 3 years ago
- A Survey on Data Selection for Language Models ★254 · Apr 29, 2025 · Updated 9 months ago
- Experiments for the NeurIPS 2021 paper "Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks" ★13 · Oct 25, 2021 · Updated 4 years ago
- GhostSuite (Official Codebase for "Data Shapley in One Training Run", ICLR'25) ★31 · Jan 16, 2026 · Updated 3 weeks ago
- A library for calibrating classifiers and computing calibration metrics ★14 · Nov 28, 2022 · Updated 3 years ago
- Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature ★180 · Jun 24, 2025 · Updated 7 months ago
- Code for ICLR 2022 Paper, "Controlling Directions Orthogonal to a Classifier" ★35 · Jun 6, 2023 · Updated 2 years ago
- "An AGI architecture in vector space" Paper to be submitted to AGI-16 ★14 · Nov 16, 2024 · Updated last year
- NeurIPS 2022: Estimating Noise Transition Matrix with Label Correlations for Noisy Multi-Label Learning ★18 · Mar 3, 2023 · Updated 2 years ago
- ★15 · Mar 31, 2022 · Updated 3 years ago
- Efficient approximation algorithms for Shapley Values in Horizontal Enterprise Federated Learning ★13 · May 5, 2020 · Updated 5 years ago
- An amortized approach for calculating local Shapley value explanations ★105 · Nov 29, 2023 · Updated 2 years ago
- Code Release for Learning to Adapt to Evolving Domains ★31 · Jul 21, 2021 · Updated 4 years ago
- ★18 · Jul 24, 2023 · Updated 2 years ago
- [CVPR 2021] Scalability vs. Utility: Do We Have to Sacrifice One for the Other in Data Importance Quantification? ★33 · Dec 26, 2020 · Updated 5 years ago
- Measuring data importance over ML pipelines using the Shapley value. ★45 · Aug 26, 2025 · Updated 5 months ago
- Benchmarks for semi-supervised domain generalization. ★72 · Aug 25, 2022 · Updated 3 years ago
- Code for NeurIPS'23 paper "A Bayesian Approach To Analysing Training Data Attribution In Deep Learning" ★17 · Jan 12, 2024 · Updated 2 years ago
- MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022) ★108 · Aug 29, 2022 · Updated 3 years ago
- ICLR 2022 (Spotlight): Continual Learning With Filter Atom Swapping ★16 · Jul 5, 2023 · Updated 2 years ago
- ★21 · Dec 30, 2022 · Updated 3 years ago