rmrisforbidden / Fooling_Neural_Network-Interpretations
This repository provides a PyTorch implementation of "Fooling Neural Network Interpretations via Adversarial Model Manipulation". Our paper has been accepted to NeurIPS 2019.
☆22 Updated 4 years ago
Alternatives and similar repositories for Fooling_Neural_Network-Interpretations:
Users who are interested in Fooling_Neural_Network-Interpretations are comparing it to the repositories listed below.
- code release for the paper "On Completeness-aware Concept-Based Explanations in Deep Neural Networks" ☆53 Updated 3 years ago
- reference implementation for "explanations can be manipulated and geometry is to blame" ☆36 Updated 2 years ago
- On the effectiveness of adversarial training against common corruptions [UAI 2022] ☆30 Updated 2 years ago
- Code for the paper "Adversarial Neural Pruning with Latent Vulnerability Suppression" ☆14 Updated 2 years ago
- Provable Robustness of ReLU networks via Maximization of Linear Regions [AISTATS 2019] ☆32 Updated 4 years ago
- Interpretation of Neural Network is Fragile ☆36 Updated 11 months ago
- Understanding and Improving Fast Adversarial Training [NeurIPS 2020] ☆96 Updated 3 years ago
- Codes for reproducing the experimental results in "Proper Network Interpretability Helps Adversarial Robustness in Classification", publi… ☆13 Updated 4 years ago
- [ICLR 2021] "Robust Overfitting may be mitigated by properly learned smoothening" by Tianlong Chen*, Zhenyu Zhang*, Sijia Liu, Shiyu Chan… ☆46 Updated 3 years ago
- Pre-Training Buys Better Robustness and Uncertainty Estimates (ICML 2019) ☆100 Updated 3 years ago
- Code for the paper "Understanding Generalization through Visualizations" ☆60 Updated 4 years ago
- Semisupervised learning for adversarial robustness https://arxiv.org/pdf/1905.13736.pdf ☆141 Updated 5 years ago
- Pytorch implementation of Adversarially Robust Distillation (ARD) ☆59 Updated 5 years ago
- Code for the paper "MMA Training: Direct Input Space Margin Maximization through Adversarial Training" ☆34 Updated 5 years ago
- Code and data for the ICLR 2021 paper "Perceptual Adversarial Robustness: Defense Against Unseen Threat Models". ☆55 Updated 3 years ago
- PyTorch implementations of Adversarial defenses and utils. ☆34 Updated last year
- Adversarial Defense for Ensemble Models (ICML 2019) ☆61 Updated 4 years ago
- Max Mahalanobis Training (ICML 2018 + ICLR 2020) ☆90 Updated 4 years ago
- Do input gradients highlight discriminative features? [NeurIPS 2021] (https://arxiv.org/abs/2102.12781) ☆13 Updated 2 years ago
- Code for the paper: Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization (https://arxiv.org/abs/2… ☆22 Updated 4 years ago
- Implementation of Confidence-Calibrated Adversarial Training (CCAT). ☆45 Updated 4 years ago
- A Closer Look at Accuracy vs. Robustness ☆88 Updated 3 years ago
- ☆14 Updated 5 years ago
- Understanding Catastrophic Overfitting in Single-step Adversarial Training [AAAI 2021] ☆28 Updated 2 years ago
- [ICML'20] Multi Steepest Descent (MSD) for robustness against the union of multiple perturbation models. ☆26 Updated 9 months ago
- Implemented CURE algorithm from robustness via curvature regularization and vice versa ☆31 Updated 2 years ago
- Code release for the ICML 2019 paper "Are generative classifiers more robust to adversarial attacks?" ☆23 Updated 5 years ago
- Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks, in ICCV 2019 ☆59 Updated 5 years ago
- The Pitfalls of Simplicity Bias in Neural Networks [NeurIPS 2020] (http://arxiv.org/abs/2006.07710v2) ☆39 Updated last year
- Python implementation for evaluating explanations presented in "On the (In)fidelity and Sensitivity for Explanations" in NeurIPS 2019 for… ☆25 Updated 3 years ago