koayon/atp_star

PyTorch and NNsight implementation of AtP* (Kramár et al., 2024, DeepMind)

☆20

Alternatives and similar repositories for atp_star

Users who are interested in atp_star are comparing it to the libraries listed below.

Butanium / tiny-activation-dashboard
A tiny easily hackable implementation of a feature dashboard.
☆17 · Oct 21, 2025 · Updated 9 months ago
FlyingPumba / InterpBench
A benchmark for mechanistic discovery of circuits in Transformers
☆17 · Dec 15, 2024 · Updated last year
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆20 · Jan 28, 2025 · Updated last year
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆99 · Dec 31, 2025 · Updated 6 months ago
slavachalnev / SAE-TS
Improving Steering Vectors by Targeting Sparse Autoencoder Features
☆29 · Nov 20, 2024 · Updated last year
ApolloResearch / apd
Attribution-based Parameter Decomposition
☆35 · Jun 11, 2025 · Updated last year
science-of-finetuning / sparsity-artifacts-crosscoders
Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper.
☆17 · Jul 6, 2026 · Updated 2 weeks ago
curt-tigges / probity
☆19 · Apr 10, 2025 · Updated last year
Phylliida / MambaLens
Mamba support for transformer lens
☆20 · Sep 17, 2024 · Updated last year
kdu4108 / context-vs-prior-finetuning
☆15 · May 27, 2025 · Updated last year
understanding-search / structured-representations-maze-transformers
see github.com/understanding-search/maze-transformer
☆10 · Dec 8, 2023 · Updated 2 years ago
science-of-finetuning / crosscoder_learning
Modified to support crosscoder training.
☆27 · Jul 2, 2026 · Updated 3 weeks ago
cvndsh / rebus
REBUS: A Robust Evaluation Benchmark of Understanding Symbols
☆13 · Aug 13, 2024 · Updated last year
tripos-education / maths-tripos-questions
Archive of questions from the Cambridge Mathematics Tripos
☆10 · Jun 6, 2022 · Updated 4 years ago
callummcdougall / path_patching
Implementation of path patching & activation patching (will eventually add to TransformerLens).
☆15 · Jan 8, 2024 · Updated 2 years ago
ag8 / sha-transformer
☆12 · Jul 8, 2024 · Updated 2 years ago
yash-srivastava19 / arrakis
Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments.
☆31 · Jul 8, 2026 · Updated 2 weeks ago
MaheepChaudhary / SAE-Ravel
Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…
☆13 · Jan 26, 2025 · Updated last year
EleutherAI / attribute
☆16 · Nov 14, 2025 · Updated 8 months ago
tilde-research / activault
Engine for collecting, uploading, and downloading model activations
☆30 · Apr 2, 2025 · Updated last year
koayon / awesome-sparse-autoencoders
A curated reading list of research in Sparse Autoencoders, Feature Extraction and related topics in Mechanistic Interpretability
☆33 · Jan 30, 2025 · Updated last year
UKGovernmentBEIS / hibayes
☆53 · May 17, 2026 · Updated 2 months ago
science-of-finetuning / diffing-toolkit
A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively.
☆78 · Updated this week
ndif-team / nnterp
Unified access to Large Language Model modules using NNsight
☆116 · Updated this week
apartresearch / specificityplus
👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark"
☆20 · Jan 19, 2024 · Updated 2 years ago
Aaquib111 / edge-attribution-patching
Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery"
☆48 · May 31, 2024 · Updated 2 years ago
msakarvadia / AttentionLens
Interpreting the latent space representations of attention head outputs for LLMs
☆39 · Aug 13, 2024 · Updated last year
hijohnnylin / neuronpedia-scorer
☆17 · Feb 14, 2024 · Updated 2 years ago
ARBORproject / arborproject.github.io
☆86 · Feb 25, 2025 · Updated last year
ArthurConmy / Automatic-Circuit-Discovery
☆293 · Oct 1, 2024 · Updated last year
jacobdunefsky / llm-steering-opt
Tools for optimizing steering vectors in LLMs.
☆22 · Apr 10, 2025 · Updated last year
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆58 · May 1, 2025 · Updated last year
Chillee / lit-llama
Simple (fast) transformer inference in PyTorch with torch.compile + lit-llama code
☆10 · Aug 29, 2023 · Updated 2 years ago
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆267 · Feb 27, 2026 · Updated 4 months ago
flamewei123 / DEPN
☆24 · Apr 20, 2024 · Updated 2 years ago
oclivegriffin / crosscode
A library for training crosscoders
☆17 · May 28, 2025 · Updated last year
curt-tigges / crosslayer-coding
☆18 · Jul 9, 2025 · Updated last year
harish-kamath / rqae
Residual Quantization Autoencoder, used for interpreting LLMs
☆14 · Jan 1, 2025 · Updated last year
hannamw / EAP-IG
☆83 · May 23, 2026 · Updated 2 months ago