jacobdunefsky/transcoder_circuits

jacobdunefsky / transcoder_circuits

☆212

Alternatives and similar repositories for transcoder_circuits

Users who are interested in transcoder_circuits are comparing it to the libraries listed below.

decoderesearch / SAELens
Training Sparse Autoencoders on Language Models
☆1,477 · Updated this week
saprmarks / dictionary_learning
☆427 · Aug 21, 2025 · Updated 11 months ago
saprmarks / feature-circuits
☆223 · Oct 14, 2025 · Updated 9 months ago
etredal / openCLT
☆61 · Sep 17, 2025 · Updated 10 months ago
openai / sparse_autoencoder
☆595 · Jul 19, 2024 · Updated 2 years ago
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆266 · Updated this week
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆265 · Feb 27, 2026 · Updated 4 months ago
EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆732 · Updated this week
TransformerLensOrg / TransformerLens
A library for mechanistic interpretability of GPT-style language models
☆3,695 · Updated this week
HoagyC / sparse_coding
Using sparse coding to find distributed representations used by neural networks.
☆306 · Nov 10, 2023 · Updated 2 years ago
TransformerLensOrg / CircuitsVis
Mechanistic Interpretability Visualizations using React
☆358 · Apr 30, 2026 · Updated 2 months ago
EleutherAI / clt-training
Sparsify transformers with cross-layer transcoders
☆26 · Nov 14, 2025 · Updated 8 months ago
decoderesearch / circuit-tracer
☆2,868 · Updated this week
anthropics / attribution-graphs-frontend
https://transformer-circuits.pub/2025/attribution-graphs/methods.html
☆103 · Mar 27, 2025 · Updated last year
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆58 · May 1, 2025 · Updated last year
ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆995 · Updated this week
JoshEngels / MultiDimensionalFeatures
Code for reproducing our paper "Not All Language Model Features Are Linear"
☆90 · Nov 27, 2024 · Updated last year
EleutherAI / attribute
☆16 · Nov 14, 2025 · Updated 8 months ago
neelnanda-io / Crosscoders
☆60 · Nov 19, 2024 · Updated last year
cooperleong00 / Awesome-LLM-Interpretability
A curated list of LLM interpretability material: tutorials, libraries, surveys, papers, blogs, etc.
☆308 · Jan 22, 2026 · Updated 6 months ago
adamkarvonen / SAEBench
☆177 · May 1, 2026 · Updated 2 months ago
explanare / ravel
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
☆58 · Oct 30, 2025 · Updated 8 months ago
ckkissane / crosscoder-model-diff-replication
Open source replication of Anthropic's Crosscoders for Model Diffing
☆68 · Oct 27, 2024 · Updated last year
OpenMOSS / Llamascopium
Performant framework for training, analyzing and visualizing Sparse Autoencoders (SAEs) and their frontier variants.
☆223 · Updated this week
ai-safety-foundation / sparse_autoencoder
Sparse Autoencoder for Mechanistic Interpretability
☆303 · Jul 20, 2024 · Updated 2 years ago
adamkarvonen / activation_oracles
☆95 · Apr 18, 2026 · Updated 3 months ago
KihoPark / linear_rep_geometry
Code for 'The Linear Representation Hypothesis and the Geometry of Large Language Models' (ICML 2024)
☆125 · Feb 11, 2025 · Updated last year
goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆183 · Apr 21, 2025 · Updated last year
curt-tigges / crosslayer-coding
☆18 · Jul 9, 2025 · Updated last year
Prisma-Multimodal / ViT-Prisma
ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs).
☆378 · Jul 23, 2025 · Updated 11 months ago
ckkissane / sae-transfer
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13 · Jul 18, 2024 · Updated 2 years ago
wesg52 / llm-context-neurons
Find context neurons in Pythia models.
☆13 · Jun 13, 2023 · Updated 3 years ago
callummcdougall / sae-exercises-mats
☆26 · Dec 20, 2023 · Updated 2 years ago
JoshEngels / SAE-Probes
Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing"
☆33 · Mar 31, 2025 · Updated last year
stanfordnlp / axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
☆210 · Mar 12, 2026 · Updated 4 months ago
stanfordnlp / pyvene
Stanford NLP Python library for understanding and improving PyTorch models via interventions
☆892 · Mar 6, 2026 · Updated 4 months ago
AIRI-Institute / SAE-Reasoning
☆99 · Mar 28, 2025 · Updated last year
fjzzq2002 / random_transformers
Official code for "Algorithmic Capabilities of Random Transformers" (NeurIPS 2024)
☆15 · Sep 28, 2024 · Updated last year
bartbussmann / BatchTopK
Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs)
☆67 · Jul 24, 2025 · Updated 11 months ago