hbaniecki / adversarial-explainable-ai
💡 Adversarial attacks on explanations and how to defend them
★ 299 · Updated 8 months ago
Related projects
Alternatives and complementary repositories for adversarial-explainable-ai
- Adversarial Attacks on Post Hoc Explanation Techniques (LIME/SHAP) · ★ 80 · Updated last year
- OpenXAI: Towards a Transparent Evaluation of Model Explanations · ★ 232 · Updated 3 months ago
- All about explainable AI, algorithmic fairness and more · ★ 107 · Updated last year
- Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations · ★ 558 · Updated last week
- Interesting resources related to Explainable Artificial Intelligence, Interpretable Machine Learning, Interactive Machine Learning, Human… · ★ 72 · Updated 2 years ago
- A Python library for Secure and Explainable Machine Learning · ★ 153 · Updated last week
- A library for experimenting with, training and evaluating neural networks, with a focus on adversarial robustness · ★ 918 · Updated 10 months ago
- A curated list of awesome Fairness in AI resources · ★ 314 · Updated last year
- RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track] · ★ 667 · Updated 2 weeks ago
- Code for using CDEP from the paper "Interpretations are useful: penalizing explanations to align neural networks with prior knowledge" ht… · ★ 127 · Updated 3 years ago
- A curated list of papers on adversarial machine learning (adversarial examples and defense methods) · ★ 211 · Updated 2 years ago
- Reference implementation for "Explanations can be manipulated and geometry is to blame" · ★ 35 · Updated 2 years ago
- Reference tables to introduce and organize evaluation methods and measures for explainable machine learning systems · ★ 73 · Updated 2 years ago
- ★ 121 · Updated 2 years ago
- Related papers for robust machine learning · ★ 564 · Updated last year
- Code for "On Adaptive Attacks to Adversarial Example Defenses" · ★ 85 · Updated 3 years ago
- Simple PyTorch implementations of adversarial training methods on CIFAR-10 · ★ 155 · Updated 3 years ago
- Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019) · ★ 125 · Updated 3 years ago
- ★ 565 · Updated last year
- Provable adversarial robustness at ImageNet scale · ★ 368 · Updated 5 years ago
- ★ 140 · Updated last month
- A repository to quickly generate synthetic data and associated trojaned deep learning models · ★ 74 · Updated last year
- Creating and defending against adversarial examples · ★ 42 · Updated 5 years ago
- A unified benchmark problem for data poisoning attacks · ★ 151 · Updated last year
- Library containing PyTorch implementations of various adversarial attacks and resources · ★ 149 · Updated last month
- List of relevant resources for machine learning from explanatory supervision · ★ 152 · Updated 4 months ago
- CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms · ★ 283 · Updated last year
- An amortized approach for calculating local Shapley value explanations · ★ 92 · Updated 11 months ago
- Code for "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks" · ★ 656 · Updated 6 months ago
- LaTeX source for the paper "On Evaluating Adversarial Robustness" · ★ 253 · Updated 3 years ago