ajobi-uhc / seer
Designed for interpretability researchers who want to do research on or with interp agents. It provides quality-of-life improvements and fixes some of the annoyances of using Claude Code out of the box.
☆124 · Updated this week
Alternatives and similar repositories for seer
Users who are interested in seer compare it to the repositories listed below.
- ☆17 · Aug 30, 2025 · Updated 5 months ago
- Unified access to Large Language Model modules using NNsight ☆88 · Feb 6, 2026 · Updated last week
- PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind) ☆20 · Jan 19, 2025 · Updated last year
- Make open-weight LLM agents play the game "Among Us", and study how the models learn and express lying and deception in the game. ☆24 · Dec 17, 2025 · Updated last month
- A curated reading list of research in Sparse Autoencoders, Feature Extraction and related topics in Mechanistic Interpretability ☆30 · Jan 30, 2025 · Updated last year
- Course Materials for Interpretability of Large Language Models (0368.4264) at Tel Aviv University ☆297 · Updated this week
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 · Jul 18, 2024 · Updated last year
- ☆146 · Dec 30, 2025 · Updated last month
- A benchmark for mechanistic discovery of circuits in Transformers ☆16 · Dec 15, 2024 · Updated last year
- A TinyStories LM with SAEs and transcoders ☆14 · Apr 3, 2025 · Updated 10 months ago
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆811 · Updated this week
- Engine for collecting, uploading, and downloading model activations ☆26 · Apr 2, 2025 · Updated 10 months ago
- ☆71 · Updated this week
- ☆389 · Aug 21, 2025 · Updated 5 months ago
- A library for efficient patching and automatic circuit discovery. ☆88 · Dec 31, 2025 · Updated last month
- Minimum Description Length probing for neural network representations ☆20 · Jan 28, 2025 · Updated last year
- ☆17 · Updated this week
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆83 · Nov 27, 2024 · Updated last year
- Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network" ☆26 · Jun 4, 2024 · Updated last year
- Landing page for MIB: A Mechanistic Interpretability Benchmark ☆24 · Aug 15, 2025 · Updated 6 months ago
- ☆51 · Jan 20, 2026 · Updated 3 weeks ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆241 · Updated this week
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆240 · Dec 16, 2024 · Updated last year
- Training Sparse Autoencoders on Language Models ☆1,201 · Updated this week
- Efficiently computing & storing token n-grams from large corpora ☆26 · Oct 6, 2024 · Updated last year
- open source interpretability platform 🧠 ☆704 · Updated this week
- ⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper. ☆111 · Oct 27, 2025 · Updated 3 months ago
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks. ☆34 · Oct 28, 2025 · Updated 3 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆191 · Apr 17, 2025 · Updated 9 months ago
- Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL. ☆238 · Aug 11, 2025 · Updated 6 months ago
- ☆28 · Jan 12, 2026 · Updated last month
- ☆929 · Feb 4, 2026 · Updated last week
- Inference API for many LLMs and other useful tools for empirical research ☆104 · Feb 6, 2026 · Updated last week
- Auditing agents for fine-tuning safety ☆18 · Oct 21, 2025 · Updated 3 months ago
- ☆24 · Oct 2, 2025 · Updated 4 months ago
- Decoder only transformer, built from scratch with PyTorch ☆32 · Oct 22, 2023 · Updated 2 years ago
- Prompts used in the Automated Auditing Blog Post ☆138 · Jul 24, 2025 · Updated 6 months ago
- A library for mechanistic interpretability of GPT-style language models ☆3,073 · Updated this week
- ☆35 · Sep 13, 2023 · Updated 2 years ago