JoshEngels/SAE-Dark-Matter

Code for our paper "Decomposing The Dark Matter of Sparse Autoencoders"

☆23

Alternatives and similar repositories for SAE-Dark-Matter

Users who are interested in SAE-Dark-Matter are comparing it to the libraries listed below.

Butanium / tiny-activation-dashboard
A tiny easily hackable implementation of a feature dashboard.
☆17 · Oct 21, 2025 · Updated 9 months ago
wbopan / safety-residual-space
Multi-dimensional analysis of orthogonal safety directions in LLM alignment
☆22 · Jun 12, 2026 · Updated last month
Phylliida / MambaLens
Mamba support for transformer lens
☆20 · Sep 17, 2024 · Updated last year
JoshEngels / SAE-Probes
Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing"
☆33 · Mar 31, 2025 · Updated last year
matchten / LoRA-Models-for-SAEs
Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"
☆17 · Mar 31, 2025 · Updated last year
ThirdAIResearch / Dessert
DESSERT Efficiently Searches Sets of Embeddings via Retrieval Tables
☆18 · Feb 21, 2024 · Updated 2 years ago
XuchanBao / behavioral-self-awareness
☆37 · Feb 20, 2025 · Updated last year
lasr-spelling / sae-spelling
Code for the paper "A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders"
☆15 · Dec 28, 2025 · Updated 6 months ago
slavachalnev / SAE-TS
Improving Steering Vectors by Targeting Sparse Autoencoder Features
☆28 · Nov 20, 2024 · Updated last year
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆266 · Updated this week
jam3scampbell / llama-lying
Code for our paper "Localizing Lying in Llama"
☆15 · Apr 24, 2025 · Updated last year
krafton-ai / lexico
KV cache compression via sparse coding
☆17 · Oct 26, 2025 · Updated 8 months ago
ordavid-s / decomposing-activations-local-geometry
☆29 · May 27, 2026 · Updated last month
nagornovys / Cancer_cell_evolution
tugHall: a simulator of cancer cell evolution based on the hallmarks of cancer, linked to the mutational states of tumor-related genes. T…
☆13 · Dec 11, 2023 · Updated 2 years ago
amack315 / unsupervised-steering-vectors
☆38 · Apr 30, 2024 · Updated 2 years ago
SproutNan / AI-Safety_SCAV
This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector"
☆49 · Oct 13, 2025 · Updated 9 months ago
bartbussmann / BatchTopK
Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs)
☆67 · Jul 24, 2025 · Updated 11 months ago
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆58 · May 1, 2025 · Updated last year
dylanmeysmans / FSharp.Data.Tdms
TDMS 2.0 support for F# and C#
☆13 · Dec 26, 2022 · Updated 3 years ago
EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆732 · Updated this week
diovisgood / QGEN_Lua
Competing Genetic Algorithm to find profitable Trading Strategies on a financial market
☆13 · May 29, 2019 · Updated 7 years ago
safety-research / SHADE-Arena
☆26 · Jun 22, 2025 · Updated last year
allenai / understanding_mcqa
Code for the arXiv preprint "Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions"
☆15 · Aug 2, 2025 · Updated 11 months ago
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆265 · Feb 27, 2026 · Updated 4 months ago
amudide / switch_sae
Efficient Dictionary Learning with Switch Sparse Autoencoders (SAEs)
☆25 · Dec 1, 2024 · Updated last year
kxcloud / gradient-routing
☆11 · Dec 4, 2024 · Updated last year
noanabeshima / matryoshka-saes
☆33 · Nov 28, 2024 · Updated last year
redwoodresearch / Easy-Transformer
☆148 · Aug 4, 2024 · Updated last year
FlyingPumba / InterpBench
A benchmark for mechanistic discovery of circuits in Transformers
☆17 · Dec 15, 2024 · Updated last year
danro9685 / ASCETIC
ASCETIC (Agony-baSed Cancer EvoluTion InferenCe) is a novel framework for the inference of a set of statistically significant temporal pa…
☆12 · Apr 18, 2025 · Updated last year
MaheepChaudhary / SAE-Ravel
Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…
☆13 · Jan 26, 2025 · Updated last year
BatadaLab / scID
scID
☆14 · Oct 21, 2021 · Updated 4 years ago
HanSolo9682 / CounterCurate
This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.
☆19 · Jun 27, 2024 · Updated 2 years ago
science-of-finetuning / sparsity-artifacts-crosscoders
Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper.
☆17 · Jul 6, 2026 · Updated 2 weeks ago
edenbiran / HoppingTooLate
Exploring the Limitations of Large Language Models on Multi-Hop Queries
☆33 · Mar 2, 2025 · Updated last year
Keyuan125 / CS441-AppliedMachineLearning
My coding assignment for UIUC-CS441-Applied Machine Learning
☆10 · Mar 24, 2022 · Updated 4 years ago
saprmarks / dictionary_learning
☆427 · Aug 21, 2025 · Updated 11 months ago
francescojm / BAGELR
R implementation of the BAGEL method to call for gene essentiality significance
☆11 · May 1, 2018 · Updated 8 years ago
jannik-brinkmann / multilingual-features
Code for the paper "Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages" (N…
☆17 · Apr 13, 2025 · Updated last year