ajobi-uhc / seer
Designed for interpretability researchers who want to do research on or with interp agents. It adds quality-of-life improvements and fixes some of the annoyances of using Claude Code out of the box.
☆124 · Updated this week
Alternatives and similar repositories for seer
Users who are interested in seer are comparing it to the libraries listed below.
- Attribution-based Parameter Decomposition ☆33 · Jun 11, 2025 · Updated 8 months ago
- ☆17 · Aug 30, 2025 · Updated 5 months ago
- Unified access to Large Language Model modules using NNsight ☆88 · Feb 6, 2026 · Updated last week
- PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind) ☆20 · Jan 19, 2025 · Updated last year
- Make open-weight LLM agents play the game "Among Us", and study how the models learn and express lying and deception in the game. ☆23 · Dec 17, 2025 · Updated last month
- A curated reading list of research in Sparse Autoencoders, Feature Extraction and related topics in Mechanistic Interpretability ☆30 · Jan 30, 2025 · Updated last year
- Course Materials for Interpretability of Large Language Models (0368.4264) at Tel Aviv University ☆297 · Updated this week
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 · Jul 18, 2024 · Updated last year
- A tiny easily hackable implementation of a feature dashboard. ☆15 · Oct 21, 2025 · Updated 3 months ago
- A benchmark for mechanistic discovery of circuits in Transformers ☆16 · Dec 15, 2024 · Updated last year
- A TinyStories LM with SAEs and transcoders ☆14 · Apr 3, 2025 · Updated 10 months ago
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆811 · Updated this week
- Engine for collecting, uploading, and downloading model activations ☆26 · Apr 2, 2025 · Updated 10 months ago
- ☆71 · Updated this week
- ☆389 · Aug 21, 2025 · Updated 5 months ago
- A library for efficient patching and automatic circuit discovery. ☆88 · Dec 31, 2025 · Updated last month
- ☆17 · Updated this week
- Minimum Description Length probing for neural network representations ☆20 · Jan 28, 2025 · Updated last year
- Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network" ☆26 · Jun 4, 2024 · Updated last year
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆83 · Nov 27, 2024 · Updated last year
- ☆51 · Jan 20, 2026 · Updated 3 weeks ago
- ☆25 · Nov 11, 2025 · Updated 3 months ago
- Mapping out the "memory" of neural nets with data attribution ☆39 · Feb 3, 2026 · Updated last week
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆240 · Dec 16, 2024 · Updated last year
- Training Sparse Autoencoders on Language Models ☆1,201 · Updated this week
- Efficiently computing & storing token n-grams from large corpora ☆26 · Oct 6, 2024 · Updated last year
- open source interpretability platform 🧠 ☆704 · Updated this week
- ⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper. ☆111 · Oct 27, 2025 · Updated 3 months ago
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks. ☆34 · Oct 28, 2025 · Updated 3 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆191 · Apr 17, 2025 · Updated 9 months ago
- ☆28 · Jan 12, 2026 · Updated last month
- Inference API for many LLMs and other useful tools for empirical research ☆104 · Feb 6, 2026 · Updated last week
- ☆24 · Oct 2, 2025 · Updated 4 months ago
- Decoder only transformer, built from scratch with PyTorch ☆32 · Oct 22, 2023 · Updated 2 years ago
- Prompts used in the Automated Auditing Blog Post ☆138 · Jul 24, 2025 · Updated 6 months ago
- A library for mechanistic interpretability of GPT-style language models ☆3,073 · Updated this week
- ☆35 · Sep 13, 2023 · Updated 2 years ago
- Open Source Replication of Anthropic's Alignment Faking Paper ☆54 · Apr 4, 2025 · Updated 10 months ago
- Project exploring 3D volumetric rendering of NEXRAD radar data. ☆11 · Oct 23, 2023 · Updated 2 years ago