pankessel / adv_explanation_ref
reference implementation for "explanations can be manipulated and geometry is to blame"
☆36 · Updated 2 years ago
Alternatives and similar repositories for adv_explanation_ref:
Users who are interested in adv_explanation_ref are comparing it to the libraries listed below.
- code release for the paper "On Completeness-aware Concept-Based Explanations in Deep Neural Networks" ☆53 · Updated 2 years ago
- This repository provides a PyTorch implementation of "Fooling Neural Network Interpretations via Adversarial Model Manipulation". Our pap… ☆22 · Updated 4 years ago
- ☆51 · Updated 4 years ago
- Code and data for the ICLR 2021 paper "Perceptual Adversarial Robustness: Defense Against Unseen Threat Models". ☆55 · Updated 3 years ago
- Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019) ☆128 · Updated 3 years ago
- Python implementation for evaluating explanations presented in "On the (In)fidelity and Sensitivity for Explanations" in NeurIPS 2019 for… ☆25 · Updated 2 years ago
- Understanding and Improving Fast Adversarial Training [NeurIPS 2020] ☆95 · Updated 3 years ago
- ☆73 · Updated 4 years ago
- ☆37 · Updated last year
- Code for using CDEP from the paper "Interpretations are useful: penalizing explanations to align neural networks with prior knowledge" ht… ☆127 · Updated 3 years ago
- Semisupervised learning for adversarial robustness https://arxiv.org/pdf/1905.13736.pdf ☆140 · Updated 4 years ago
- Interpretation of Neural Network is Fragile ☆36 · Updated 9 months ago
- Original dataset release for CIFAR-10H ☆82 · Updated 4 years ago
- Explaining Image Classifiers by Counterfactual Generation ☆28 · Updated 2 years ago
- Source code for "Neural Anisotropy Directions" ☆15 · Updated 4 years ago
- ☆109 · Updated 2 years ago
- On the effectiveness of adversarial training against common corruptions [UAI 2022] ☆30 · Updated 2 years ago
- Source code for the paper "Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness" ☆25 · Updated 5 years ago
- Implemented CURE algorithm from robustness via curvature regularization and vice versa ☆30 · Updated 2 years ago
- Invertible Concept-based Explanation (ICE) ☆18 · Updated 3 years ago
- Code for paper "Robustness of Bayesian Neural Networks to Gradient-Based Attacks" ☆17 · Updated 11 months ago
- Towards Automatic Concept-based Explanations ☆157 · Updated 9 months ago
- The Pitfalls of Simplicity Bias in Neural Networks [NeurIPS 2020] (http://arxiv.org/abs/2006.07710v2) ☆39 · Updated last year
- Code for the ICLR 2022 paper. Salient Imagenet: How to discover spurious features in deep learning? ☆38 · Updated 2 years ago
- ☆54 · Updated 4 years ago
- Unofficial implementation of the DeepMind papers "Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples… ☆95 · Updated 2 years ago
- Simple data balancing baselines for worst-group-accuracy benchmarks. ☆41 · Updated last year
- Code for "On Adaptive Attacks to Adversarial Example Defenses" ☆86 · Updated 4 years ago
- Quantitative Testing with Concept Activation Vectors in PyTorch ☆42 · Updated 5 years ago
- Provable Robustness of ReLU networks via Maximization of Linear Regions [AISTATS 2019] ☆32 · Updated 4 years ago