Measuring data importance over ML pipelines using the Shapley value.
☆45 · Aug 26, 2025 · Updated 6 months ago
Alternatives and similar repositories for datascope
Users who are interested in datascope compare it to the libraries listed below.
- [CVPR 2021] Scalability vs. Utility: Do We Have to Sacrifice One for the Other in Data Importance Quantification? · ☆34 · Dec 26, 2020 · Updated 5 years ago
- Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value (ICML 2023) · ☆21 · Jul 26, 2023 · Updated 2 years ago
- Scalable data valuation using optimal transport (ICLR 2025) · ☆13 · Jul 15, 2025 · Updated 8 months ago
- Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning (AISTATS 2022 Oral) · ☆43 · Nov 10, 2022 · Updated 3 years ago
- Data Shapley: Equitable Valuation of Data for Machine Learning · ☆290 · May 1, 2024 · Updated last year
- PyTorch reimplementation of computing Shapley values via Truncated Monte Carlo sampling from "What is your data worth? Equitable Valuatio… · ☆27 · Jan 21, 2022 · Updated 4 years ago
- Now it is exported as an official example · ☆13 · Jan 24, 2018 · Updated 8 years ago
- Automatic Differentiation for Gradient Boosted Decision Trees. · ☆13 · May 17, 2022 · Updated 3 years ago
- Alpha version of our data-centric visual benchmark for training data selection · ☆16 · Aug 28, 2023 · Updated 2 years ago
- Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022) · ☆18 · Mar 20, 2023 · Updated 3 years ago
- Distributional Shapley: A Distributional Framework for Data Valuation · ☆30 · May 1, 2024 · Updated last year
- Time series data contribution via influence functions · ☆17 · Jan 18, 2025 · Updated last year
- ArXiv Sanidade is a web application that helps users discover and save relevant arXiv papers using machine learning. It u… · ☆16 · Sep 27, 2024 · Updated last year
- Source code and data for the paper "SALT: Sales Autocompletion Linked Business Tables Dataset" · ☆35 · Jul 9, 2025 · Updated 8 months ago
- This is an official repository for "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR 2023). · ☆53 · Jun 5, 2024 · Updated last year
- Implementation of FedBary · ☆16 · Mar 24, 2025 · Updated 11 months ago
- ☆18 · May 25, 2022 · Updated 3 years ago
- Implementation of Geolocated Articles Processing and Poverty Mapping - [KDD19] · ☆19 · Apr 24, 2021 · Updated 4 years ago
- pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation · ☆144 · Feb 11, 2026 · Updated last month
- AQuA: A Benchmarking Tool for Label Quality Assessment, NeurIPS'23 D&B · ☆23 · Oct 17, 2023 · Updated 2 years ago
- Domain Adaptation · ☆23 · Nov 27, 2021 · Updated 4 years ago
- Advanced Machine Learning Course · ☆13 · Nov 16, 2024 · Updated last year
- A Data-Centric library providing a unified interface for state-of-the-art methods for hardness characterisation of data points. · ☆25 · Mar 6, 2025 · Updated last year
- a (work-in-progress) grammatical WYSIWYG text editor · ☆12 · Sep 13, 2018 · Updated 7 years ago
- achieve modularity and separation of concerns through feature-oriented development · ☆20 · May 6, 2015 · Updated 10 years ago
- A TensorFlow implementation of GhostVLAD for speaker recognition · ☆15 · May 2, 2019 · Updated 6 years ago
- LaTeX Template for Fudan University School of Computer Science 2024 · ☆11 · May 21, 2024 · Updated last year
- ☆18 · Jun 22, 2024 · Updated last year
- ☆11 · Oct 8, 2021 · Updated 4 years ago
- 💱 A curated list of data valuation (DV) to design your next data marketplace · ☆138 · Feb 20, 2025 · Updated last year
- This repository contains the artifacts accompanied by the paper "Fair Preprocessing" · ☆13 · Jul 20, 2021 · Updated 4 years ago
- Data visualization workshop · ☆11 · May 12, 2020 · Updated 5 years ago
- Very basic solar regression · ☆16 · Updated this week
- ☆13 · Oct 3, 2024 · Updated last year
- [ICLR 2024] Contrastive Learning Is Spectral Clustering On Similarity Graph (https://arxiv.org/abs/2303.15103) · ☆21 · Sep 17, 2024 · Updated last year
- ☆11 · Dec 1, 2023 · Updated 2 years ago
- Performant, composable online learning · ☆16 · Feb 22, 2021 · Updated 5 years ago
- ☆23 · Nov 1, 2022 · Updated 3 years ago
- ☆15 · Nov 3, 2022 · Updated 3 years ago