decoderesearch/circuit-tracer

☆2,875

Alternatives and similar repositories for circuit-tracer

Users who are interested in circuit-tracer are comparing it to the libraries listed below.

hijohnnylin / neuronpedia
open source interpretability platform 🧠
☆1,086 · Jul 17, 2026 · Updated last week
decoderesearch / SAELens
Training Sparse Autoencoders on Language Models
☆1,485 · Updated this week
TransformerLensOrg / TransformerLens
A library for mechanistic interpretability of GPT-style language models
☆3,716 · Updated this week
jacobdunefsky / transcoder_circuits
☆212 · Nov 17, 2024 · Updated last year
ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆1,000 · Updated this week
saprmarks / dictionary_learning
☆428 · Aug 21, 2025 · Updated 11 months ago
jbloomAus / SAEDashboard
☆109 · May 23, 2026 · Updated 2 months ago
EleutherAI / clt-training
Sparsify transformers with cross-layer transcoders
☆26 · Nov 14, 2025 · Updated 8 months ago
EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆734 · Updated this week
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆268 · Updated this week
anthropics / attribution-graphs-frontend
https://transformer-circuits.pub/2025/attribution-graphs/methods.html
☆103 · Mar 27, 2025 · Updated last year
meridianlabs-ai / inspect_petri
An alignment auditing agent capable of quickly exploring alignment hypotheses
☆1,274 · Updated this week
adamkarvonen / SAEBench
☆178 · May 1, 2026 · Updated 2 months ago
callummcdougall / ARENA_3.0
☆1,190 · Updated this week
EleutherAI / attribute
☆16 · Nov 14, 2025 · Updated 8 months ago
TransformerLensOrg / CircuitsVis
Mechanistic Interpretability Visualizations using React
☆358 · Apr 30, 2026 · Updated 2 months ago
goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆183 · Apr 21, 2025 · Updated last year
openai / sparse_autoencoder
☆597 · Jul 19, 2024 · Updated 2 years ago
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆267 · Feb 27, 2026 · Updated 4 months ago
stanfordnlp / dspy
DSPy: The framework for programming—not prompting—language models
☆36,388 · Updated this week
PrimeIntellect-ai / verifiers
Our library for RL environments + evals
☆4,403 · Updated this week
goodfire-ai / param-decomp
Parameter Decomposition
☆136 · Updated this week
science-of-finetuning / diffing-toolkit
A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively.
☆78 · Updated this week
andyrdt / refusal_direction
Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".
☆424 · Jun 13, 2025 · Updated last year
kitft / natural_language_autoencoders
☆909 · Jun 9, 2026 · Updated last month
openai / circuit_sparsity
Open-source release accompanying Gao et al. 2025
☆530 · Dec 11, 2025 · Updated 7 months ago
OpenMOSS / Llamascopium
Performant framework for training, analyzing and visualizing Sparse Autoencoders (SAEs) and their frontier variants.
☆223 · Updated this week
safety-research / persona_vectors
Persona Vectors: Monitoring and Controlling Character Traits in Language Models
☆452 · Apr 22, 2026 · Updated 3 months ago
ajobi-uhc / seer
This was designed for interp researchers who want to do research on or with interp agents to give quality of life improvements and fix …
☆146 · Feb 8, 2026 · Updated 5 months ago
openai / harmony
Renderer for the harmony response format to be used with gpt-oss
☆4,466 · Apr 8, 2026 · Updated 3 months ago
EleutherAI / lm-evaluation-harness
A framework for few-shot evaluation of language models.
☆13,415 · Jul 13, 2026 · Updated last week
ApolloResearch / apd
Attribution-based Parameter Decomposition
☆35 · Jun 11, 2025 · Updated last year
TransluceAI / circuits
ADAG: Transluce's MLP neuron-level circuit tracing library
☆34 · Apr 10, 2026 · Updated 3 months ago
rllm-org / rllm
Democratizing Reinforcement Learning for LLMs
☆5,732 · Updated this week
safety-research / bloom
bloom - evaluate any behavior immediately 🌸🌱
☆1,374 · May 7, 2026 · Updated 2 months ago
verl-project / verl
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,667 · Updated this week
ndif-team / nnterp
Unified access to Large Language Model modules using NNsight
☆116 · Updated this week
etredal / openCLT
☆61 · Sep 17, 2025 · Updated 10 months ago
unslothai / unsloth
Unsloth is a local UI for training and running Gemma 4, Qwen3.6, DeepSeek, Kimi, GLM and other models.
☆68,918 · Updated this week