Repository for PURE: Turning Polysemantic Neurons Into Pure Features by Identifying Relevant Circuits, accepted at CVPR 2024 XAI4CV Workshop (spotlight)
☆20 May 29, 2024 Updated last year
Alternatives and similar repositories for PURE
Users that are interested in PURE are comparing it to the libraries listed below.
- ☆15 Jun 4, 2025 Updated 9 months ago
- ☆14 Nov 3, 2025 Updated 4 months ago
- [NeurIPS 2024] CoSy is an automatic evaluation framework for textual explanations of neurons. ☆19 Jan 28, 2026 Updated last month
- Code for "Don't trust your eyes: on the (un)reliability of feature visualizations" (ICML 2024) ☆34 Nov 15, 2023 Updated 2 years ago
- Pruning By Explaining Revisited: Optimizing Attribution Methods to Prune CNNs and Transformers, Paper accepted at eXCV workshop of ECCV 2… ☆30 Jan 6, 2025 Updated last year
- Mechanistic understanding and validation of large AI models with SemanticLens ☆51 Dec 4, 2025 Updated 3 months ago
- [TMLR 25] An automated method for explaining complex neuron behaviors in deep vision models using large language models ☆10 Feb 20, 2025 Updated last year
- CoRelAy is a tool to compose small-scale (single-machine) analysis pipelines. ☆31 Jul 21, 2025 Updated 8 months ago
- An eXplainable AI toolkit with Concept Relevance Propagation and Relevance Maximization ☆141 Jan 14, 2026 Updated 2 months ago
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs). ☆348 Jul 23, 2025 Updated 8 months ago
- Code for CVPR 2024 Oral "Neural Lineage" ☆17 Jun 18, 2024 Updated last year
- A tiny easily hackable implementation of a feature dashboard. ☆16 Oct 21, 2025 Updated 5 months ago
- Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP. ☆242 Jan 30, 2026 Updated last month
- ☆12 Jun 12, 2023 Updated 2 years ago
- Official implementation of the paper High-Resolution and Precise Counterfactual Medical Image Generation using Language-guided Stable Diffu… ☆23 Jul 8, 2025 Updated 8 months ago
- Layer-wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024] ☆227 Jul 11, 2025 Updated 8 months ago
- Official code for "Can We Talk Models Into Seeing the World Differently?" (ICLR 2025). ☆28 Jan 26, 2025 Updated last year
- This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce… ☆25 Feb 16, 2026 Updated last month
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps" ☆25 Jul 12, 2024 Updated last year
- ☆30 Mar 13, 2026 Updated last week
- Concept Relevance Propagation for Localization Models, accepted at SAIAD workshop at CVPR 2023. ☆15 Jan 16, 2024 Updated 2 years ago
- Implementation of the paper "Improving the Accuracy-Robustness Trade-off of Classifiers via Adaptive Smoothing". ☆10 Feb 6, 2024 Updated 2 years ago
- [NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La… ☆27 Nov 3, 2025 Updated 4 months ago
- Data for "Datamodels: Predicting Predictions with Training Data" ☆97 May 25, 2023 Updated 2 years ago
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆31 Apr 22, 2025 Updated 11 months ago
- [IEEE Transactions on Medical Imaging 2024] Harvard Glaucoma Fairness: A Retinal Nerve Disease Dataset for Fairness Learning and Fair Ide… ☆26 Jan 2, 2025 Updated last year
- ☆42 Sep 5, 2023 Updated 2 years ago
- Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch ☆10 Aug 7, 2024 Updated last year
- Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision Language Models ☆20 Oct 12, 2025 Updated 5 months ago
- Code for the ICLR 2022 paper. Salient Imagenet: How to discover spurious features in deep learning? ☆41 Aug 19, 2022 Updated 3 years ago
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons ☆13 Feb 13, 2023 Updated 3 years ago
- Understanding Rare Spurious Correlations in Neural Network ☆12 Jun 5, 2022 Updated 3 years ago
- Generative Systems for Art and Design course materials ☆12 Mar 25, 2020 Updated 5 years ago
- ☆52 Oct 23, 2023 Updated 2 years ago
- Trains small LMs. Designed for training on SimpleStories ☆12 Sep 15, 2025 Updated 6 months ago
- DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model. ☆12 May 29, 2023 Updated 2 years ago
- ☆12 Jan 10, 2023 Updated 3 years ago
- MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing… ☆10 Oct 7, 2024 Updated last year
- [NeurIPS 2023] "Learning to Augment Distributions for Out-of-distribution Detection" ☆11 Nov 14, 2023 Updated 2 years ago