☆2,618 · Updated this week
Alternatives and similar repositories for circuit-tracer
Users who are interested in circuit-tracer are comparing it to the libraries listed below.
- open source interpretability platform 🧠 ☆729 · Updated this week
- A library for mechanistic interpretability of GPT-style language models ☆3,112 · Updated this week
- Training Sparse Autoencoders on Language Models ☆1,219 · Updated this week
- Sparsify transformers with SAEs and transcoders ☆696 · Updated this week
- A tiny easily hackable implementation of a feature dashboard. ☆15 · Oct 21, 2025 · Updated 4 months ago
- ☆199 · Nov 17, 2024 · Updated last year
- DSPy: The framework for programming—not prompting—language models ☆32,381 · Updated this week
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆825 · Updated this week
- Open source interpretability artefacts for R1. ☆171 · Apr 21, 2025 · Updated 10 months ago
- Attribution-based Parameter Decomposition ☆33 · Jun 11, 2025 · Updated 8 months ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆243 · Updated this week
- ☆150 · Dec 30, 2025 · Updated 2 months ago
- Tools for merging pretrained large language models. ☆6,814 · Jan 26, 2026 · Updated last month
- https://transformer-circuits.pub/2025/attribution-graphs/methods.html ☆91 · Mar 27, 2025 · Updated 11 months ago
- Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction". ☆351 · Jun 13, 2025 · Updated 8 months ago
- ☆395 · Aug 21, 2025 · Updated 6 months ago
- ☆571 · Jul 19, 2024 · Updated last year
- ☆4,110 · Jun 4, 2024 · Updated last year
- Minimal reproduction of DeepSeek R1-Zero ☆12,767 · Apr 24, 2025 · Updated 10 months ago
- Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team. ☆21,026 · Mar 11, 2025 · Updated 11 months ago
- Our library for RL environments + evals ☆3,850 · Updated this week
- SGLang is a high-performance serving framework for large language models and multimodal models. ☆23,658 · Updated this week
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆245 · Dec 16, 2024 · Updated last year
- ☆209 · Oct 14, 2025 · Updated 4 months ago
- Stanford NLP Python library for Representation Finetuning (ReFT) ☆1,558 · Jan 14, 2026 · Updated last month
- ☆89 · Dec 18, 2025 · Updated 2 months ago
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM. ☆52,724 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆71,234 · Updated this week
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 · Jul 18, 2024 · Updated last year
- verl: Volcano Engine Reinforcement Learning for LLMs ☆19,339 · Updated this week
- Mechanistic Interpretability Visualizations using React ☆326 · Dec 18, 2024 · Updated last year
- The LLM Evaluation Framework ☆13,787 · Updated this week
- ☆55 · Jan 28, 2026 · Updated 3 weeks ago
- Universal memory layer for AI Agents ☆47,994 · Updated this week
- LlamaIndex is the leading document agent and OCR platform ☆47,210 · Updated this week
- Train transformer language models with reinforcement learning. ☆17,460 · Updated this week
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs. ☆4,752 · Jul 18, 2025 · Updated 7 months ago
- A library for efficient patching and automatic circuit discovery. ☆90 · Dec 31, 2025 · Updated 2 months ago
- A framework for few-shot evaluation of language models. ☆11,478 · Feb 15, 2026 · Updated last week