ApolloResearch/apd

Attribution-based Parameter Decomposition

☆35

Alternatives and similar repositories for apd

Users who are interested in apd are comparing it to the libraries listed below.

tilde-research / activault
Engine for collecting, uploading, and downloading model activations
☆30 · Apr 2, 2025 · Updated last year
koayon / atp_star
PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind)
☆20 · Jan 19, 2025 · Updated last year
goodfire-ai / param-decomp
Parameter Decomposition
☆133 · Updated this week
adamkarvonen / SAEBench
☆177 · May 1, 2026 · Updated 2 months ago
EleutherAI / bergson
Mapping out the "memory" of neural nets with data attribution
☆69 · Updated this week
ckkissane / crosscoder-model-diff-replication
Open source replication of Anthropic's Crosscoders for Model Diffing
☆68 · Oct 27, 2024 · Updated last year
Butanium / tiny-activation-dashboard
A tiny easily hackable implementation of a feature dashboard.
☆17 · Oct 21, 2025 · Updated 8 months ago
MaheepChaudhary / SAE-Ravel
Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…
☆13 · Jan 26, 2025 · Updated last year
RobertCsordas / onion_representations
☆13 · Aug 19, 2024 · Updated last year
saprmarks / dictionary_learning
☆427 · Aug 21, 2025 · Updated 10 months ago
goodfire-ai / scribe-task-suite
A suite of interpretability tasks to evaluate agents using Scribe for notebook access
☆18 · Oct 2, 2025 · Updated 9 months ago
FlyingPumba / InterpBench
A benchmark for mechanistic discovery of circuits in Transformers
☆17 · Dec 15, 2024 · Updated last year
mpsae / MP-SAE
☆17 · May 19, 2026 · Updated 2 months ago
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆99 · Dec 31, 2025 · Updated 6 months ago
hannamw / MIB-circuit-track
☆24 · Jun 30, 2025 · Updated last year
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆266 · Jul 13, 2026 · Updated last week
ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆995 · Updated this week
JoshEngels / MultiDimensionalFeatures
Code for reproducing our paper "Not All Language Model Features Are Linear"
☆90 · Nov 27, 2024 · Updated last year
oclivegriffin / crosscode
A library for training crosscoders
☆17 · May 28, 2025 · Updated last year
Aaquib111 / edge-attribution-patching
Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery"
☆48 · May 31, 2024 · Updated 2 years ago
Heidelberg-NLP / CC-SHAP
Code for "On Measuring Faithfulness of Natural Language Explanations"
☆23 · Updated this week
saprmarks / feature-circuits
☆223 · Oct 14, 2025 · Updated 9 months ago
EleutherAI / attribute
☆16 · Nov 14, 2025 · Updated 8 months ago
explanare / ravel
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
☆58 · Oct 30, 2025 · Updated 8 months ago
ARBORproject / arborproject.github.io
☆86 · Feb 25, 2025 · Updated last year
chanind / linear-relational
Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch
☆11 · Aug 7, 2024 · Updated last year
ordavid-s / snmf-mlp-decomposition
☆15 · Jul 7, 2026 · Updated last week
anthropics / sycophancy-to-subterfuge-paper
☆28 · Sep 5, 2024 · Updated last year
science-of-finetuning / sparsity-artifacts-crosscoders
Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper.
☆17 · Jul 6, 2026 · Updated last week
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆20 · Jan 28, 2025 · Updated last year
raybears / cot-transparency
Improving transparency of large language models' reasoning
☆15 · Nov 25, 2025 · Updated 7 months ago
gumityolcu / dualxda-pip
PyPI package for DualXDA for efficient data attribution and feature-level explanations of training data influence
☆22 · Mar 6, 2026 · Updated 4 months ago
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆58 · May 1, 2025 · Updated last year
edenbiran / HoppingTooLate
Exploring the Limitations of Large Language Models on Multi-Hop Queries
☆33 · Mar 2, 2025 · Updated last year
clarifying-EM / model-organisms-for-EM
Code repo for the model organisms and convergent directions of EM papers.
☆72 · Sep 22, 2025 · Updated 9 months ago
ejnnr / cupbearer
A library for mechanistic anomaly detection
☆22 · Jan 9, 2025 · Updated last year
PalisadeResearch / ctfish
Chess agent specification gaming
☆25 · Updated this week
ajobi-uhc / seer
This was designed for interp researchers who want to do research on or with interp agents to give quality of life improvements and fix …
☆146 · Feb 8, 2026 · Updated 5 months ago
timaeus-research / devinterp
Tools for studying developmental interpretability in neural networks.
☆145 · Apr 23, 2026 · Updated 2 months ago