pankessel / adv_explanation_refLinks

reference implementation for "explanations can be manipulated and geometry is to blame"

☆36

Alternatives and similar repositories for adv_explanation_ref

Users that are interested in adv_explanation_ref are comparing it to the libraries listed below

Sorting:

chihkuanyeh / concept_exp
code release for the paper "On Completeness-aware Concept-Based Explanations in Deep Neural Networks"
☆53Updated 3 years ago
rmrisforbidden / Fooling_Neural_Network-Interpretations
This repository provides a PyTorch implementation of "Fooling Neural Network Interpretations via Adversarial Model Manipulation". Our pap…
☆22Updated 4 years ago
princetonvisualai / DomainBiasMitigation
☆74Updated 5 years ago
jcpeterson / cifar-10h
Original dataset release for CIFAR-10H
☆83Updated 4 years ago
cassidylaidlaw / perceptual-advex
Code and data for the ICLR 2021 paper "Perceptual Adversarial Robustness: Defense Against Unseen Threat Models".
☆55Updated 3 years ago
adebayoj / sanity_checks_saliency
☆111Updated 2 years ago
yaircarmon / semisup-adv
Semisupervised learning for adversarial robustness https://arxiv.org/pdf/1905.13736.pdf
☆142Updated 5 years ago
bethgelab / AnalysisBySynthesis
Adversarially Robust Neural Network on MNIST.
☆64Updated 3 years ago
harshays / simplicitybiaspitfalls
The Pitfalls of Simplicity Bias in Neural Networks [NeurIPS 2020] (http://arxiv.org/abs/2006.07710v2)
☆41Updated last year
chihkuanyeh / saliency_evaluation
Python implementation for evaluating explanations presented in "On the (In)fidelity and Sensitivity for Explanations" in NeurIPS 2019 for…
☆25Updated 3 years ago
locuslab / robust_overfitting
☆158Updated 4 years ago
F-Salehi / CURE_robustness
Implemented CURE algorithm from robustness via curvature regularization and vice versa
☆31Updated 2 years ago
zzzace2000 / FIDO-saliency
Explaining Image Classifiers by Counterfactual Generation
☆28Updated 3 years ago
Hadisalman / smoothing-adversarial
Code for our NeurIPS 2019 *spotlight* "Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers"
☆227Updated 5 years ago
imrahulr / adversarial_robustness_pytorch
Unofficial implementation of the DeepMind papers "Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples…
☆97Updated 3 years ago
google-research-datasets / bam
☆51Updated 4 years ago
tml-epfl / understanding-fast-adv-training
Understanding and Improving Fast Adversarial Training [NeurIPS 2020]
☆95Updated 3 years ago
zhangrh93 / InvertibleCE
Invertible Concept-based Explanation (ICE)
☆18Updated 4 years ago
ftramer / Excessive-Invariance
Source code for the paper "Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness"
☆25Updated 5 years ago
LTS4 / neural-anisotropy-directions
Source code for "Neural Anisotropy Directions"
☆16Updated 4 years ago
singlasahil14 / salient_imagenet
Code for the ICLR 2022 paper. Salient Imagenet: How to discover spurious features in deep learning?
☆40Updated 2 years ago
ddkang / advex-uar
Code for "Testing Robustness Against Unforeseen Adversaries"
☆81Updated 11 months ago
laura-rieger / deep-explanation-penalization
Code for using CDEP from the paper "Interpretations are useful: penalizing explanations to align neural networks with prior knowledge" ht…
☆127Updated 4 years ago
watml / fast-wasserstein-adversarial
Implementation of Wasserstein adversarial attacks.
☆23Updated 4 years ago
charan223 / FairDeepLearning
☆37Updated 2 years ago
tml-epfl / adv-training-corruptions
On the effectiveness of adversarial training against common corruptions [UAI 2022]
☆30Updated 3 years ago
yewsiang / ConceptBottleneck
Concept Bottleneck Models, ICML 2020
☆205Updated 2 years ago
YisenWang / MART
Code for ICLR2020 "Improving Adversarial Robustness Requires Revisiting Misclassified Examples"
☆150Updated 4 years ago
locuslab / smoothing
Provable adversarial robustness at ImageNet scale
☆392Updated 6 years ago
amiratag / ACE
Towards Automatic Concept-based Explanations
☆159Updated last year