adamkarvonen/dictionary_learning_demo

adamkarvonen / dictionary_learning_demo

☆26

Alternatives and similar repositories for dictionary_learning_demo

Users who are interested in dictionary_learning_demo are comparing it to the repositories listed below.

adamkarvonen / SAE_BoardGameEval
☆25 · Jan 28, 2025 · Updated last year
adamkarvonen / SAEBench
☆179 · May 1, 2026 · Updated 2 months ago
saprmarks / dictionary_learning
☆428 · Aug 21, 2025 · Updated 11 months ago
swei2001 / RouteSAEs
☆15 · Jan 2, 2026 · Updated 6 months ago
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆269 · Updated this week
bartbussmann / matryoshka_sae
☆72 · Jan 17, 2025 · Updated last year
amudide / switch_sae
Efficient Dictionary Learning with Switch Sparse Autoencoders (SAEs)
☆25 · Dec 1, 2024 · Updated last year
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆58 · May 1, 2025 · Updated last year
oclivegriffin / crosscode
A library for training crosscoders
☆17 · May 28, 2025 · Updated last year
noanabeshima / matryoshka-saes
☆33 · Nov 28, 2024 · Updated last year
OpenMOSS / Llamascopium
Performant framework for training, analyzing and visualizing Sparse Autoencoders (SAEs) and their frontier variants.
☆225 · Jul 22, 2026 · Updated last week
ApolloResearch / deception-detection
☆44 · Feb 11, 2025 · Updated last year
tilde-research / sieve
Applying SAEs for fine-grained control
☆27 · Dec 15, 2024 · Updated last year
tim-lawson / mlsae
Multi-Layer Sparse Autoencoders (ICLR 2025)
☆30 · Feb 6, 2026 · Updated 5 months ago
MaheepChaudhary / SAE-Ravel
Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…
☆13 · Jan 26, 2025 · Updated last year
goodfire-ai / sdxl-turbo-interpretability
☆49 · May 27, 2025 · Updated last year
ckkissane / crosscoder-model-diff-replication
Open source replication of Anthropic's Crosscoders for Model Diffing
☆68 · Oct 27, 2024 · Updated last year
EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆735 · Jul 20, 2026 · Updated last week
AIRI-Institute / SAE-Reasoning
☆98 · Mar 28, 2025 · Updated last year
MikaStars39 / FeatureAlignment
FeatureAlignment = Alignment + Mechanistic Interpretability
☆35 · Mar 8, 2025 · Updated last year
decoderesearch / SAELens
Training Sparse Autoencoders on Language Models
☆1,487 · Updated this week
curt-tigges / probity
☆19 · Apr 10, 2025 · Updated last year
hijohnnylin / neuronpedia-scorer
☆17 · Feb 14, 2024 · Updated 2 years ago
SimengSun / ChapterBreak
☆12 · Jun 5, 2024 · Updated 2 years ago
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆268 · Feb 27, 2026 · Updated 5 months ago
jacobdunefsky / llm-steering-opt
Tools for optimizing steering vectors in LLMs.
☆22 · Apr 10, 2025 · Updated last year
cvenhoff / steering-thinking-llms
☆39 · Jul 9, 2025 · Updated last year
Aaquib111 / edge-attribution-patching
Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery"
☆48 · May 31, 2024 · Updated 2 years ago
edeyneka / pdf-reader-extension
☆13 · Mar 9, 2025 · Updated last year
princeton-nlp / Edge-Pruning
[NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning".
☆70 · Aug 15, 2025 · Updated 11 months ago
HugoFry / mats_sae_training_for_ViTs
☆25 · Apr 23, 2024 · Updated 2 years ago
zihao12 / concept-algebra
☆27 · Feb 9, 2023 · Updated 3 years ago
allenai / everyday-things
☆17 · Dec 6, 2023 · Updated 2 years ago
peterljq / Tutorial-of-Data-Distillation-and-Condensation
A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but …
☆13 · Dec 1, 2022 · Updated 3 years ago
evandez / relations
How do transformer LMs encode relations?
☆60 · Feb 24, 2024 · Updated 2 years ago
ajobi-uhc / seer
This was designed for interp researchers who want to do research on or with interp agents to give quality of life improvements and fix …
☆146 · Feb 8, 2026 · Updated 5 months ago
wyshi / lm_privacy
☆21 · Sep 21, 2021 · Updated 4 years ago
shengliu66 / FractionalReason
Official github repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17 · Jun 30, 2025 · Updated last year
liushiliushi / ConfTuner
Official code of ConfTuner: Training Large Language Models to Express Their Confidence Verbally
☆27 · Sep 26, 2025 · Updated 10 months ago