TomFrederik/unseal

TomFrederik / unseal

Mechanistic Interpretability for Transformer Models

☆53

Alternatives and similar repositories for unseal

Users that are interested in unseal are comparing it to the libraries listed below.

redwoodresearch / interp
Redwood Research's transformer interpretability tools
☆15 · Apr 15, 2022 · Updated 4 years ago
anthropics / PySvelte
A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations
☆228 · Dec 22, 2021 · Updated 4 years ago
jbloomAus / DecisionTransformerInterpretability
Interpreting how transformers simulate agents performing RL tasks
☆90 · Oct 23, 2023 · Updated 2 years ago
guy-dar / embedding-space
☆58 · Jun 15, 2023 · Updated 3 years ago
neelnanda-io / Neuroscope
Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons
☆15 · Feb 13, 2023 · Updated 3 years ago
nostalgebraist / transformer-utils
Utilities for the HuggingFace transformers library
☆77 · Jan 21, 2023 · Updated 3 years ago
JacobPfau / procgenAISC
☆20 · Jan 21, 2023 · Updated 3 years ago
saprmarks / feature-circuits
☆223 · Oct 14, 2025 · Updated 9 months ago
wesg52 / sparse-probing-paper
Sparse probing paper full code.
☆68 · Dec 17, 2023 · Updated 2 years ago
vzhong / silg
☆20 · Jan 14, 2022 · Updated 4 years ago
TomFrederik / grokking
Re-implementation of 'Grokking: Generalization beyond overfitting on small algorithmic datasets'
☆38 · Dec 4, 2021 · Updated 4 years ago
koayon / atp_star
PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind)
☆20 · Jan 19, 2025 · Updated last year
bilal-chughtai / rep-theory-mech-interp
☆31 · May 4, 2023 · Updated 3 years ago
montemac / activation_additions
Algebraic value editing in pretrained language models
☆71 · Nov 1, 2023 · Updated 2 years ago
redwoodresearch / remix_public
☆20 · Feb 17, 2023 · Updated 3 years ago
understanding-search / maze-transformer
This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks.
☆35 · Oct 28, 2025 · Updated 9 months ago
AlignmentResearch / tuned-lens
Tools for understanding how transformer predictions are built layer-by-layer
☆605 · Aug 7, 2025 · Updated 11 months ago
Nix07 / belief_tracking
This repository contains the code used for the experiments in the paper "Language Models use Lookbacks to Track Beliefs".
☆16 · Mar 14, 2026 · Updated 4 months ago
understanding-search / structured-representations-maze-transformers
see github.com/understanding-search/maze-transformer
☆10 · Dec 8, 2023 · Updated 2 years ago
MaheepChaudhary / SAE-Ravel
Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…
☆13 · Jan 26, 2025 · Updated last year
google-deepmind / cartesian-frames
A formalisation of Cartesian Frames, a perspective on embedded agency, in the HOL theorem prover.
☆22 · Dec 20, 2021 · Updated 4 years ago
FL33TW00D / wgpu-bench
☆12 · Jun 27, 2024 · Updated 2 years ago
fdalvi / NeuroX
A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.
☆110 · Oct 4, 2023 · Updated 2 years ago
redwoodresearch / mlab
Machine Learning for Alignment Bootcamp
☆84 · Apr 27, 2022 · Updated 4 years ago
moirage / alignment-research-dataset
A dataset of alignment research and code to reproduce it
☆80 · Jun 22, 2023 · Updated 3 years ago
quantified-uncertainty / ai-safety-papers
☆22 · Sep 9, 2021 · Updated 4 years ago
callummcdougall / ARENA_2.0
Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL.
☆247 · Aug 11, 2025 · Updated 11 months ago
wesg52 / universal-neurons
Universal Neurons in GPT2 Language Models
☆30 · May 28, 2024 · Updated 2 years ago
kdu4108 / semiring-backprop-exps
☆16 · Jul 10, 2023 · Updated 3 years ago
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆268 · Feb 27, 2026 · Updated 5 months ago
quantified-uncertainty / squiggle
This monorepo covers multiple QURI projects, including Squiggle language, Squiggle Hub and Metaforecast
☆217 · Updated this week
akandykeller / TopographicVAE
Official implementation of the paper "Topographic VAEs learn Equivariant Capsules"
☆81 · Mar 4, 2022 · Updated 4 years ago
jenni-ai / T2FW
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆20 · Oct 9, 2022 · Updated 3 years ago
belindal / state-probes
Code for the paper "Implicit Representations of Meaning in Neural Language Models"
☆57 · Feb 14, 2023 · Updated 3 years ago
photogeniq / image-encoders
🖼️📊
☆11 · Jun 9, 2020 · Updated 6 years ago
google-deepmind / tracr
☆569 · Feb 5, 2024 · Updated 2 years ago
Confirm-Solutions / dreamy
Fluent dreaming for language models
☆13 · Jul 22, 2024 · Updated 2 years ago
StampyAI / stampy
A Discord bot for the Robert Miles AI server
☆41 · Jan 27, 2026 · Updated 6 months ago
Cadenza-Labs / sleeper-agents
☆15 · Jul 12, 2024 · Updated 2 years ago