rachtibat / LRP-eXplains-Transformers
Layer-wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024]
☆196 Updated 3 months ago
Alternatives and similar repositories for LRP-eXplains-Transformers
Users interested in LRP-eXplains-Transformers are comparing it to the libraries listed below.
- Using sparse coding to find distributed representations used by neural networks. ☆281 Updated last year
- An eXplainable AI toolkit with Concept Relevance Propagation and Relevance Maximization ☆136 Updated last year
- ☆133 Updated 2 weeks ago
- ☆32 Updated 11 months ago
- [NeurIPS 2024] CoSy is an automatic evaluation framework for textual explanations of neurons. ☆18 Updated 4 months ago
- A toolkit for quantitative evaluation of data attribution methods. ☆53 Updated 3 months ago
- A fast, effective data attribution method for neural networks in PyTorch ☆220 Updated 11 months ago
- [NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La… ☆16 Updated this week
- This repository collects all relevant resources about interpretability in LLMs ☆375 Updated last year
- Official Code Implementation of the paper: XAI for Transformers: Better Explanations through Conservative Propagation ☆67 Updated 3 years ago
- Mechanistic understanding and validation of large AI models with SemanticLens ☆41 Updated last month
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆76 Updated last year
- ☆355 Updated 2 months ago
- Code for the paper: Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery. ECCV 2024. ☆49 Updated 11 months ago
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆60 Updated last year
- Sparse Autoencoder for Mechanistic Interpretability ☆278 Updated last year
- A resource repository for representation engineering in large language models ☆139 Updated 11 months ago
- A Python Data Valuation Package ☆30 Updated 2 years ago
- MetaQuantus is an XAI performance tool to identify reliable evaluation metrics ☆39 Updated last year
- Conformal Language Modeling ☆32 Updated last year
- 👋 Overcomplete is a Vision-based SAE Toolbox ☆96 Updated 3 months ago
- ☆192 Updated 2 weeks ago
- ☆208 Updated 11 months ago
- ☆28 Updated 11 months ago
- Repository for PURE: Turning Polysemantic Neurons Into Pure Features by Identifying Relevant Circuits, accepted at CVPR 2024 XAI4CV Works… ☆19 Updated last year
- NeuroSurgeon is a package that enables researchers to uncover and manipulate subnetworks within models in Huggingface Transformers ☆41 Updated 8 months ago
- ☆108 Updated 8 months ago
- An Open Source Implementation of Anthropic's Paper: "Towards Monosemanticity: Decomposing Language Models with Dictionary Learning" ☆49 Updated last year
- Influence Analysis and Estimation - Survey, Papers, and Taxonomy ☆83 Updated last year
- Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP. ☆233 Updated 3 months ago