openai/circuit_sparsity

Open-source release accompanying Gao et al. 2025

☆531

Alternatives and similar repositories for circuit_sparsity

Users who are interested in circuit_sparsity are comparing it to the libraries listed below.

lacoco-lab / decompiling_transformers
Repo for Paper: Discovering Interpretable Algorithms by Decompiling Transformers to RASP
☆15 · May 25, 2026 · Updated last month
decoderesearch / circuit-tracer
☆2,869 · Updated this week
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆267 · Updated this week
curt-tigges / crosslayer-coding
☆18 · Jul 9, 2025 · Updated last year
g-luo / generative_latent_prior
Official PyTorch Implementation for Learning a Generative Meta-Model of LLM Activations, ICML 2026
☆90 · Apr 30, 2026 · Updated 2 months ago
goodfire-ai / param-decomp
Parameter Decomposition
☆133 · Updated this week
EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆733 · Updated this week
saprmarks / dictionary_learning
☆427 · Aug 21, 2025 · Updated 11 months ago
OscarXZQ / delta_activations
Official code release for Delta Activations: A Representation for Finetuned Large Language Models
☆20 · Sep 5, 2025 · Updated 10 months ago
decoderesearch / SAELens
Training Sparse Autoencoders on Language Models
☆1,483 · Updated this week
facebookresearch / PhysicsLM4
Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality
☆356 · May 20, 2026 · Updated 2 months ago
tilde-research / activault
Engine for collecting, uploading, and downloading model activations
☆30 · Apr 2, 2025 · Updated last year
KellerJordan / modded-nanogpt
NanoGPT (124M) in 90 seconds
☆5,570 · Jul 3, 2026 · Updated 2 weeks ago
ckkissane / sae-transfer
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13 · Jul 18, 2024 · Updated 2 years ago
alxndrTL / IntroRL
Repo for the introductory course on reinforcement learning.
☆19 · Feb 2, 2025 · Updated last year
ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆995 · Updated this week
TransformerLensOrg / TransformerLens
A library for mechanistic interpretability of GPT-style language models
☆3,705 · Updated this week
jbloomAus / SAEDashboard
☆109 · May 23, 2026 · Updated 2 months ago
Dao-AILab / sonic-moe
Accelerating MoE with IO and Tile-aware Optimizations
☆732 · Jul 4, 2026 · Updated 2 weeks ago
hijohnnylin / neuronpedia
open source interpretability platform 🧠
☆1,078 · Updated this week
goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆183 · Apr 21, 2025 · Updated last year
yifanzhang-pro / deep-delta-learning
Official Project Page for Deep Delta Learning (https://arxiv.org/abs/2601.00417)
☆356 · Jun 15, 2026 · Updated last month
fjzzq2002 / WeightWatch
Official Repository of Paper "Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs"
☆15 · Sep 25, 2025 · Updated 9 months ago
kerner-lab / Sparse-GPT-Pretraining
A codebase for pretraining multi-billion-scale sparse GPTs.
☆24 · Feb 9, 2026 · Updated 5 months ago
adamkarvonen / dictionary_learning_demo
☆26 · Aug 23, 2025 · Updated 10 months ago
ckkissane / crosscoder-model-diff-replication
Open source replication of Anthropic's Crosscoders for Model Diffing
☆68 · Oct 27, 2024 · Updated last year
HazyResearch / Megakernels
Kernels, of the mega variety :)
☆784 · May 26, 2026 · Updated last month
science-of-finetuning / sparsity-artifacts-crosscoders
Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper.
☆17 · Jul 6, 2026 · Updated 2 weeks ago
openai / harmony
Renderer for the harmony response format to be used with gpt-oss
☆4,461 · Apr 8, 2026 · Updated 3 months ago
KellerJordan / Muon
Muon is an optimizer for hidden layers in neural networks
☆2,724 · May 24, 2026 · Updated last month
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆267 · Feb 27, 2026 · Updated 4 months ago
ndif-team / nnterp
Unified access to Large Language Model modules using NNsight
☆116 · Jul 2, 2026 · Updated 3 weeks ago
jacobdunefsky / transcoder_circuits
☆212 · Nov 17, 2024 · Updated last year
deepseek-ai / Engram
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
☆4,536 · Jan 14, 2026 · Updated 6 months ago
goodfire-ai / scribe-task-suite
A suite of interpretability tasks to evaluate agents using Scribe for notebook access
☆18 · Oct 2, 2025 · Updated 9 months ago
PrimeIntellect-ai / verifiers
Our library for RL environments + evals
☆4,392 · Updated this week
iliao2345 / CompressARC
☆226 · Jan 5, 2026 · Updated 6 months ago
open-tinker / OpenTinker
OpenTinker is an RL-as-a-Service infrastructure for foundation models
☆676 · Mar 21, 2026 · Updated 4 months ago
safety-research / finetuning-auditor
Auditing agents for fine-tuning safety
☆21 · Oct 21, 2025 · Updated 9 months ago