socialfoundations / benchbench

BenchBench is a Python package to evaluate multi-task benchmarks.

☆15

Alternatives and similar repositories for benchbench

Users who are interested in benchbench are comparing it to the libraries listed below.

js-d / sim_metric
☆35 Updated last year
JonasGeiping / dataaugs
☆18 Updated 2 years ago
facebookresearch / ModelRatatouille
Recycling diverse models
☆44 Updated 2 years ago
MadryLab / modeldiff
ModelDiff: A Framework for Comparing Learning Algorithms
☆57 Updated last year
aangelopoulos / private_prediction_sets
Wrap around any model to output differentially private prediction sets with finite sample validity on any dataset.
☆17 Updated last year
izmailovpavel / spurious_feature_learning
☆45 Updated 2 years ago
alexrame / diwa
DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization
☆31 Updated 2 years ago
acmi-lab / RLSbench
Code and results accompanying our paper titled RLSbench: Domain Adaptation under Relaxed Label Shift
☆34 Updated last year
ykwon0407 / beta_shapley
Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning (AISTATS 2022 Oral)
☆41 Updated 2 years ago
noranta4 / ASIF
Personal implementation of ASIF by Antonio Norelli
☆25 Updated last year
JeanKaddour / LAWA
Latest Weight Averaging (NeurIPS HITY 2022)
☆30 Updated 2 years ago
aniruddhraghu / meta-pretraining
Code accompanying paper: Meta-Learning to Improve Pre-Training
☆37 Updated 3 years ago
gortizji / linearized-networks
Source code of "What can linearized neural networks actually say about generalization?"
☆20 Updated 3 years ago
google-research / understanding-curricula
☆34 Updated 3 weeks ago
ae-foster / invclr
Improving Transformation Invariance in Contrastive Representation Learning
☆13 Updated 4 years ago
HazyResearch / model-patching
Model Patching: Closing the Subgroup Performance Gap with Data Augmentation
☆42 Updated 4 years ago
rtaori / data_feedback
Code for the paper "Data Feedback Loops: Model-driven Amplification of Dataset Biases"
☆16 Updated 2 years ago
MadryLab / DebuggableDeepNetworks
☆38 Updated 4 years ago
snu-mllab / Neural-Relation-Graph
Official PyTorch implementation of "Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data" (NeurIPS'23)
☆15 Updated last year
stanislavfort / adversaries_to_OOD_detection
☆12 Updated 2 years ago
singlasahil14 / barlow
Code for the CVPR 2021 paper: Understanding Failures of Deep Networks via Robust Feature Extraction
☆36 Updated 3 years ago
TristanThrush / perplexity-correlations
Simple and scalable tools for data-driven pretraining data selection.
☆24 Updated 2 weeks ago
MadryLab / ImageNetMultiLabel
Fine-grained ImageNet annotations
☆29 Updated 5 years ago
MadryLab / EditingClassifiers
☆95 Updated 2 years ago
shoaibahmed / metadata_archaeology
Official code for the paper: "Metadata Archaeology"
☆19 Updated 2 years ago
tml-epfl / sharpness-vs-generalization
A modern look at the relationship between sharpness and generalization [ICML 2023]
☆43 Updated last year
visinf / fast-axiomatic-attribution
Fast Axiomatic Attribution for Neural Networks (NeurIPS 2021)
☆16 Updated 2 years ago
kakaobrain / irm-empirical-study
An Empirical Study of Invariant Risk Minimization
☆27 Updated 4 years ago
google-research / jax-influence
☆60 Updated 3 years ago
ganguli-lab / degrees-of-freedom
☆37 Updated 3 years ago