dmitrykazhdan / CME
CME: Concept-based Model Extraction
☆12 · Updated 4 years ago
Alternatives and similar repositories for CME:
Users interested in CME are comparing it to the libraries listed below.
- DISSECT: Disentangled Simultaneous Explanations via Concept Traversals ☆11 · Updated last year
- This repository contains the implementation of Concept Activation Regions, a new framework to explain deep neural networks with human con… ☆11 · Updated 2 years ago
- Code release for the paper "On Completeness-aware Concept-Based Explanations in Deep Neural Networks" ☆53 · Updated 3 years ago
- ☆38 · Updated 3 years ago
- Library implementing state-of-the-art Concept-based and Disentanglement Learning methods for Explainable AI ☆54 · Updated 2 years ago
- ☆16 · Updated 2 years ago
- Self-Explaining Neural Networks ☆13 · Updated last year
- CVPR'19 experiments with (on-manifold) adversarial examples. ☆44 · Updated 5 years ago
- Understanding Rare Spurious Correlations in Neural Networks ☆12 · Updated 2 years ago
- Implementation of the paper "A Framework for Learning Ante-hoc Explainable Models via Concepts" (CVPR 2022). ☆8 · Updated 9 months ago
- Official implementation of "Robust Semantic Interpretability: Revisiting Concept Activation Vectors" ☆11 · Updated 4 years ago
- Code for the ICLR 2022 paper "Attention-based interpretability with Concept Transformers" ☆40 · Updated 2 years ago
- Code for "Generative causal explanations of black-box classifiers" ☆34 · Updated 4 years ago
- Code for Environment Inference for Invariant Learning (ICML 2021 paper) ☆50 · Updated 3 years ago
- Overlooked Factors in Concept-based Explanations: Dataset Choice, Concept Learnability, and Human Capability (CVPR 2023) ☆9 · Updated 2 years ago
- ☆45 · Updated 2 years ago
- On the effectiveness of adversarial training against common corruptions [UAI 2022] ☆30 · Updated 2 years ago
- Code for "Interpretable Image Recognition with Hierarchical Prototypes" ☆18 · Updated 5 years ago
- This repository provides a PyTorch implementation of "Fooling Neural Network Interpretations via Adversarial Model Manipulation". Our pap… ☆22 · Updated 4 years ago
- Code for reproducing the experimental results in "Proper Network Interpretability Helps Adversarial Robustness in Classification", publi… ☆13 · Updated 4 years ago
- Code for the paper "Getting a CLUE: A Method for Explaining Uncertainty Estimates" ☆35 · Updated 11 months ago
- [ICLR'22] Self-supervised learning of optimally robust representations for domain shift. ☆23 · Updated 3 years ago
- Reference implementation for "explanations can be manipulated and geometry is to blame" ☆36 · Updated 2 years ago
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations is a ServiceNow Research project that was started at Elemen… ☆13 · Updated last year
- Python implementation for evaluating explanations presented in "On the (In)fidelity and Sensitivity for Explanations" in NeurIPS 2019 for… ☆25 · Updated 3 years ago
- Official implementation for Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds (NeurIPS 2021). ☆23 · Updated 2 years ago
- Explaining Image Classifiers by Counterfactual Generation ☆28 · Updated 3 years ago
- Do input gradients highlight discriminative features? [NeurIPS 2021] (https://arxiv.org/abs/2102.12781) ☆13 · Updated 2 years ago
- PyTorch code for KDD 18 paper: Towards Explanation of DNN-based Prediction with Guided Feature Inversion ☆21 · Updated 6 years ago
- On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [NeurIPS 2020] ☆36 · Updated 3 years ago