decoderesearch/automated-interpretability

decoderesearch / automated-interpretability

☆24

Alternatives and similar repositories for automated-interpretability

Users who are interested in automated-interpretability are comparing it to the libraries listed below.

alexjfoote / Neuron2Graph
Tools for exploring Transformer neuron behaviour, including input pruning and diversification.
☆10 · Jun 6, 2023 · Updated 3 years ago
EleutherAI / attribute
☆16 · Nov 14, 2025 · Updated 8 months ago
hijohnnylin / neuronpedia-scorer
☆17 · Feb 14, 2024 · Updated 2 years ago
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆58 · May 1, 2025 · Updated last year
noanabeshima / tinymodel
A TinyStories LM with SAEs and transcoders
☆14 · Apr 3, 2025 · Updated last year
curt-tigges / crosslayer-coding
☆18 · Jul 9, 2025 · Updated last year
adamkarvonen / SAEBench
☆177 · May 1, 2026 · Updated 2 months ago
neelnanda-io / Neuroscope
Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons
☆14 · Feb 13, 2023 · Updated 3 years ago
goodfire-ai / scribe-task-suite
A suite of interpretability tasks to evaluate agents using Scribe for notebook access
☆18 · Oct 2, 2025 · Updated 9 months ago
ApolloResearch / sample
Repository with sample code using Apollo's suggested engineering practices
☆15 · Dec 16, 2024 · Updated last year
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆265 · Feb 27, 2026 · Updated 4 months ago
callummcdougall / sae-exercises-mats
☆26 · Dec 20, 2023 · Updated 2 years ago
ndif-team / ndif
The NDIF server, which performs deep inference and serves nnsight requests remotely
☆50 · Updated this week
ckkissane / sae-transfer
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13 · Jul 18, 2024 · Updated 2 years ago
Butanium / tiny-activation-dashboard
A tiny easily hackable implementation of a feature dashboard.
☆17 · Oct 21, 2025 · Updated 9 months ago
interp-reasoning / thought-anchors
⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.
☆137 · Oct 27, 2025 · Updated 8 months ago
rloganiv / kglm-data
Code used to create the Linked WikiText-2 dataset
☆16 · May 22, 2023 · Updated 3 years ago
goodfire-ai / scribe
☆85 · Feb 18, 2026 · Updated 5 months ago
jbloomAus / SAEDashboard
☆109 · May 23, 2026 · Updated last month
decoderesearch / SAELens
Training Sparse Autoencoders on Language Models
☆1,476 · Updated this week
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆266 · Jul 13, 2026 · Updated last week
neelnanda-io / 1L-Sparse-Autoencoder
☆141 · Oct 28, 2023 · Updated 2 years ago
Xu0615 / FinetuneCircuits
A Mechanistic-Interpretability study that finds the structural dynamics of Large Language Models under fine-tuning.
☆17 · May 30, 2025 · Updated last year
Aaquib111 / edge-attribution-patching
Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery"
☆48 · May 31, 2024 · Updated 2 years ago
adamkarvonen / SAE_BoardGameEval
☆25 · Jan 28, 2025 · Updated last year
googleinterns / localizing-paragraph-memorization
☆15 · Feb 21, 2024 · Updated 2 years ago
bartbussmann / BatchTopK
Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs)
☆67 · Jul 24, 2025 · Updated 11 months ago
marlic7 / reveal.js-online
A simple app that combines Ace Editor and RevealJS. You can write markdown on the left, and preview your presentation on the right.
☆11 · Mar 17, 2021 · Updated 5 years ago
recursivelabsai / Self-Tracing
Building on Anthropic's Circuit Tracer, Neuronpedia, Ameisen et al. (2025) and Lindsey et al. (2025), we attempt to extend the paradigm w…
☆74 · Aug 1, 2025 · Updated 11 months ago
ai-safety-foundation / sparse_autoencoder
Sparse Autoencoder for Mechanistic Interpretability
☆303 · Jul 20, 2024 · Updated 2 years ago
oclivegriffin / crosscode
A library for training crosscoders
☆17 · May 28, 2025 · Updated last year
curt-tigges / probity
☆19 · Apr 10, 2025 · Updated last year
saprmarks / feature-circuits
☆223 · Oct 14, 2025 · Updated 9 months ago
LLM-MI-Research / Actionable-MI
☆15 · Jan 20, 2026 · Updated 6 months ago
jettjaniak / chainscope
Repository for the "Chain-of-Thought Reasoning In The Wild Is Not Always Faithful" paper
☆35 · Mar 31, 2026 · Updated 3 months ago
ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆995 · Updated this week
writing-assistant / writing-assistant.github.io
☆18 · Sep 3, 2024 · Updated last year
EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆732 · Updated this week
YanniKouloumbis / next-js-window-ai
A Next.js chatbot app demonstrating seamless integration with window.ai.
☆15 · Jun 25, 2023 · Updated 3 years ago