safety-research / circuit-tracer

☆2,595

Alternatives and similar repositories for circuit-tracer

Users who are interested in circuit-tracer are comparing it to the libraries listed below.

hijohnnylin / neuronpedia
open source interpretability platform 🧠
☆704 · Updated this week

TransformerLensOrg / TransformerLens
A library for mechanistic interpretability of GPT-style language models
☆3,073 · Updated this week

decoderesearch / SAELens
Training Sparse Autoencoders on Language Models
☆1,201 · Updated this week

EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆692 · Feb 9, 2026 · Updated last week

Butanium / tiny-activation-dashboard
A tiny easily hackable implementation of a feature dashboard.
☆15 · Oct 21, 2025 · Updated 3 months ago

jacobdunefsky / transcoder_circuits
☆198 · Nov 17, 2024 · Updated last year

stanfordnlp / dspy
DSPy: The framework for programming, not prompting, language models
☆32,156 · Updated this week

ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆811 · Updated this week

goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆170 · Apr 21, 2025 · Updated 9 months ago

ApolloResearch / apd
Attribution-based Parameter Decomposition
☆33 · Jun 11, 2025 · Updated 8 months ago

EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆241 · Feb 9, 2026 · Updated last week

adamkarvonen / SAEBench
☆146 · Dec 30, 2025 · Updated last month

arcee-ai / mergekit
Tools for merging pretrained large language models.
☆6,783 · Jan 26, 2026 · Updated 3 weeks ago

anthropics / attribution-graphs-frontend
https://transformer-circuits.pub/2025/attribution-graphs/methods.html
☆91 · Mar 27, 2025 · Updated 10 months ago

saprmarks / dictionary_learning
☆394 · Aug 21, 2025 · Updated 5 months ago

openai / sparse_autoencoder
☆570 · Jul 19, 2024 · Updated last year

openai / transformer-debugger
☆4,109 · Jun 4, 2024 · Updated last year

andyrdt / refusal_direction
Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".
☆342 · Jun 13, 2025 · Updated 8 months ago

Jiayi-Pan / TinyZero
Minimal reproduction of DeepSeek R1-Zero
☆12,748 · Apr 24, 2025 · Updated 9 months ago

openai / swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
☆20,950 · Mar 11, 2025 · Updated 11 months ago

sgl-project / sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
☆23,547 · Updated this week

PrimeIntellect-ai / verifiers
Our library for RL environments + evals
☆3,833 · Updated this week

callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆243 · Dec 16, 2024 · Updated last year

saprmarks / feature-circuits
☆207 · Oct 14, 2025 · Updated 4 months ago

stanfordnlp / pyreft
Stanford NLP Python library for Representation Finetuning (ReFT)
☆1,555 · Jan 14, 2026 · Updated last month

jbloomAus / SAEDashboard
☆88 · Dec 18, 2025 · Updated last month

unslothai / unsloth
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
☆51,922 · Updated this week

vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆70,205 · Updated this week

ckkissane / sae-transfer
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13 · Jul 18, 2024 · Updated last year

verl-project / verl
verl: Volcano Engine Reinforcement Learning for LLMs
☆19,132 · Updated this week

TransformerLensOrg / CircuitsVis
Mechanistic Interpretability Visualizations using React
☆320 · Dec 18, 2024 · Updated last year

confident-ai / deepeval
The LLM Evaluation Framework
☆13,613 · Feb 10, 2026 · Updated last week

adamkarvonen / activation_oracles
☆51 · Jan 28, 2026 · Updated 2 weeks ago

mem0ai / mem0
Universal memory layer for AI Agents
☆47,230 · Feb 3, 2026 · Updated 2 weeks ago

run-llama / llama_index
LlamaIndex is the leading framework for building LLM-powered agents over your data.
☆46,977 · Updated this week

huggingface / trl
Train transformer language models with reinforcement learning.
☆17,360 · Updated this week

facebookresearch / lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
☆4,754 · Jul 18, 2025 · Updated 6 months ago

UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆90 · Dec 31, 2025 · Updated last month

huggingface / open-r1
Fully open reproduction of DeepSeek-R1
☆25,879 · Nov 24, 2025 · Updated 2 months ago
