google-research / interpretability-theory
☆27 Updated 2 years ago
Alternatives and similar repositories for interpretability-theory
Users who are interested in interpretability-theory are comparing it to the libraries listed below.
- ☆37 Updated last year
- Recycling diverse models ☆45 Updated 2 years ago
- ModelDiff: A Framework for Comparing Learning Algorithms ☆59 Updated 2 years ago
- Official code for the paper: "Metadata Archaeology" ☆19 Updated 2 years ago
- BenchBench is a Python package to evaluate multi-task benchmarks. ☆16 Updated last year
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible. ☆43 Updated 5 months ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 Updated 2 years ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 Updated last year
- Code accompanying paper: Meta-Learning to Improve Pre-Training ☆37 Updated 3 years ago
- Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling ☆31 Updated 4 years ago
- Data for "Datamodels: Predicting Predictions with Training Data" ☆97 Updated 2 years ago
- Learning to Split for Automatic Bias Detection ☆47 Updated 2 years ago
- Implementations of growing and pruning in neural networks ☆22 Updated 2 years ago
- Library implementing state-of-the-art Concept-based and Disentanglement Learning methods for Explainable AI ☆55 Updated 3 years ago
- DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization ☆31 Updated 2 years ago
- Code for the paper "Data Feedback Loops: Model-driven Amplification of Dataset Biases" ☆16 Updated 2 years ago
- ☆95 Updated 2 years ago
- ☆60 Updated 3 years ago
- Model Patching: Closing the Subgroup Performance Gap with Data Augmentation ☆42 Updated 4 years ago
- A weak supervision framework for (partial) labeling functions ☆16 Updated last year
- (ICML 2021) Mandoline: Model Evaluation under Distribution Shift ☆30 Updated 4 years ago
- Measuring if attention is explanation with ROAR ☆22 Updated 2 years ago
- ☆108 Updated 2 years ago
- A Domain-Agnostic Benchmark for Self-Supervised Learning ☆107 Updated 2 years ago
- Logic Explained Networks is a python repository implementing explainable-by-design deep learning models. ☆51 Updated 2 years ago
- Minimum Description Length probing for neural network representations ☆18 Updated 7 months ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆33 Updated 2 years ago
- Quantification of Uncertainty with Adversarial Models ☆30 Updated 2 years ago
- Personal implementation of ASIF by Antonio Norelli ☆25 Updated last year
- ☆132 Updated last month