rachtibat / LRP-eXplains-Transformers
Layer-Wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024]
☆172 · Updated this week
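For readers unfamiliar with the technique named in the title, below is a minimal, generic sketch of the LRP epsilon-rule applied to a single linear layer in plain PyTorch. It only illustrates how relevance is redistributed from a layer's output back to its input; it is not the LXT API, and the function and variable names are illustrative.

```python
import torch

def lrp_epsilon_linear(layer: torch.nn.Linear, x: torch.Tensor,
                       relevance_out: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Redistribute relevance from a linear layer's output back to its input (epsilon-rule)."""
    z = layer(x)                                       # forward activations: z = x W^T + b
    stabilizer = eps * torch.where(z >= 0, torch.ones_like(z), -torch.ones_like(z))
    s = relevance_out / (z + stabilizer)               # per-output relevance-to-activation ratio
    c = s @ layer.weight                               # redistribute along the weights: c = s W
    return x * c                                       # relevance assigned to each input feature

# Example: relevance flows from 4 output units back onto 8 input features.
layer = torch.nn.Linear(8, 4)
x = torch.randn(2, 8)
relevance_out = torch.randn(2, 4)
relevance_in = lrp_epsilon_linear(layer, x, relevance_out)
print(relevance_in.shape)  # torch.Size([2, 8])
```

In a full LRP pass, a rule like this is applied layer by layer, starting from the logit of interest at the output and ending with a relevance score per input token or pixel.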
Alternatives and similar repositories for LRP-eXplains-Transformers
Users interested in LRP-eXplains-Transformers are comparing it to the libraries listed below.
- An eXplainable AI toolkit with Concept Relevance Propagation and Relevance Maximization ☆130 · Updated last year
- A toolkit for quantitative evaluation of data attribution methods. ☆49 · Updated last week
- [NeurIPS 2024] CoSy is an automatic evaluation framework for textual explanations of neurons. ☆16 · Updated 3 weeks ago
- Official code implementation of the paper "XAI for Transformers: Better Explanations through Conservative Propagation" ☆63 · Updated 3 years ago
- A fast, effective data attribution method for neural networks in PyTorch ☆212 · Updated 7 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆71 · Updated 9 months ago
- This repository collects all relevant resources about interpretability in LLMs ☆362 · Updated 8 months ago
- Using sparse coding to find distributed representations used by neural networks. ☆259 · Updated last year
- A simple PyTorch implementation of influence functions. ☆89 · Updated last year
- Zennit is a high-level Python/PyTorch framework for explaining and exploring neural networks with attribution methods such as LRP. ☆227 · Updated this week
- ☆31 · Updated 7 months ago
- A resource repository for representation engineering in large language models ☆127 · Updated 8 months ago
- Code for the paper "Post-hoc Concept Bottleneck Models". Spotlight @ ICLR 2023 ☆79 · Updated last year
- ☆105 · Updated last month
- ☆51 · Updated 4 months ago
- The one-stop repository for large language model (LLM) unlearning. Supports TOFU, MUSE, WMDP, and many unlearning methods. All features: … ☆313 · Updated 2 weeks ago
- ☆137 · Updated last year
- A Python Data Valuation Package ☆31 · Updated 2 years ago
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆57 · Updated last year
- Editing Models with Task Arithmetic (a minimal sketch of the idea appears after this list) ☆482 · Updated last year
- ☆99 · Updated 5 months ago
- General-purpose activation steering library ☆83 · Updated 2 months ago
- `dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. ☆78 · Updated last month
- A basic implementation of Layer-wise Relevance Propagation (LRP) in PyTorch. ☆96 · Updated 2 years ago
- [ICLR 23] A new framework to transform any neural networks into an interpretable concept-bottleneck-model (CBM) without needing labeled c… ☆106 · Updated last year
- Sparse Autoencoder for Mechanistic Interpretability ☆255 · Updated 11 months ago
- Concept Bottleneck Models, ICML 2020 ☆205 · Updated 2 years ago
- Code for the paper: Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery. ECCV 2024. ☆46 · Updated 8 months ago
- Sparse probing paper full code. ☆58 · Updated last year
- ☆314 · Updated last month
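As referenced at the "Editing Models with Task Arithmetic" entry above, the core idea fits in a few lines: a task vector is the element-wise difference between fine-tuned and pretrained weights, and adding or subtracting scaled task vectors edits a model's behavior. The sketch below is a generic PyTorch state_dict illustration under that assumption; the function names are illustrative and are not taken from the paper's codebase.

```python
import torch

def task_vector(pretrained: dict, finetuned: dict) -> dict:
    """tau = theta_finetuned - theta_pretrained, computed per parameter tensor."""
    return {name: finetuned[name] - pretrained[name] for name in pretrained}

def apply_task_vectors(pretrained: dict, task_vectors: list, scale: float = 1.0) -> dict:
    """theta_edited = theta_pretrained + scale * sum_i tau_i
    (addition to add a capability, negation to forget one)."""
    edited = {name: tensor.clone() for name, tensor in pretrained.items()}
    for tau in task_vectors:
        for name in edited:
            edited[name] += scale * tau[name]
    return edited
```

In practice the dictionaries would be `model.state_dict()` snapshots taken before and after fine-tuning on each task.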