hijohnnylin/neuronpedia-scorer

☆17

Alternatives and similar repositories for neuronpedia-scorer

Users who are interested in neuronpedia-scorer are comparing it to the repositories listed below.

neelnanda-io / Neuroscope
Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons
☆15 · Feb 13, 2023 · Updated 3 years ago
decoderesearch / automated-interpretability
☆24 · Feb 13, 2026 · Updated 5 months ago
tilde-research / sieve
Applying SAEs for fine-grained control
☆27 · Dec 15, 2024 · Updated last year
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆268 · Feb 27, 2026 · Updated 5 months ago
alexjfoote / Neuron2Graph
Tools for exploring Transformer neuron behaviour, including input pruning and diversification.
☆10 · Jun 6, 2023 · Updated 3 years ago
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆58 · May 1, 2025 · Updated last year
Butanium / tiny-activation-dashboard
A tiny, easily hackable implementation of a feature dashboard.
☆17 · Oct 21, 2025 · Updated 9 months ago
koayon / atp_star
PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind)
☆20 · Jan 19, 2025 · Updated last year
dmadras / predict-responsibly
☆15 · Mar 28, 2024 · Updated 2 years ago
JasonGross / guarantees-based-mechanistic-interpretability
☆18 · Jul 21, 2026 · Updated last week
blei-lab / circuitry
☆16 · Oct 30, 2024 · Updated last year
PrasannS / rlhf-length-biases
☆27 · Mar 13, 2024 · Updated 2 years ago
neelnanda-io / Grokking
A Mechanistic Interpretability Analysis of Grokking
☆29 · Sep 26, 2022 · Updated 3 years ago
bartbussmann / BatchTopK
Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs)
☆67 · Jul 24, 2025 · Updated last year
slavachalnev / SAE-TS
Improving Steering Vectors by Targeting Sparse Autoencoder Features
☆29 · Nov 20, 2024 · Updated last year
adamkarvonen / dictionary_learning_demo
☆26 · Aug 23, 2025 · Updated 11 months ago
vedantpalit / Towards-Vision-Language-Mechanistic-Interpretability
This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce…
☆25 · Feb 16, 2026 · Updated 5 months ago
HugoFry / mats_sae_training_for_ViTs
☆25 · Apr 23, 2024 · Updated 2 years ago
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆269 · Updated this week
ckkissane / crosscoder-model-diff-replication
Open source replication of Anthropic's Crosscoders for Model Diffing
☆68 · Oct 27, 2024 · Updated last year
YanniKouloumbis / next-js-window-ai
A Next.js chatbot app demonstrating seamless integration with window.ai.
☆15 · Jun 25, 2023 · Updated 3 years ago
LoryPack / LLM-LieDetector
Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"
☆74 · Jun 19, 2024 · Updated 2 years ago
HumanCompatibleAI / leela-interp
Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"
☆31 · Jun 4, 2024 · Updated 2 years ago
thestephencasper / everything-you-need
we got you bro
☆38 · Jul 29, 2024 · Updated last year
montemac / activation_additions
Algebraic value editing in pretrained language models
☆71 · Nov 1, 2023 · Updated 2 years ago
noanabeshima / matryoshka-saes
☆33 · Nov 28, 2024 · Updated last year
Trustworthy-ML-Lab / Describe-and-Dissect
[TMLR 25] An automated method for explaining complex neuron behaviors in deep vision models using large language models
☆11 · Feb 20, 2025 · Updated last year
neelnanda-io / Crosscoders
☆60 · Nov 19, 2024 · Updated last year
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆99 · Dec 31, 2025 · Updated 6 months ago
not-lain / pxia
Minimalistic AI library that resembles HF's transformers
☆13 · Dec 31, 2024 · Updated last year
ENvironmentSet / overcurried
Personal blog of ENvironmentSet based on oversomething.
☆14 · Oct 29, 2024 · Updated last year
fiveai / understanding_safety_finetuning
Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)
☆12 · Oct 31, 2024 · Updated last year
danielway / nexrad-volumetric-renderer
Project exploring 3D volumetric rendering of NEXRAD radar data.
☆13 · Oct 23, 2023 · Updated 2 years ago
adamkarvonen / SAEBench
☆179 · May 1, 2026 · Updated 2 months ago
quantified-uncertainty / metaforecast
Fetch forecasts from prediction markets/forecasting platforms to make them searchable. Integrate these forecasts into other services.
☆68 · Feb 9, 2025 · Updated last year
neelnanda-io / 1L-Sparse-Autoencoder
☆141 · Oct 28, 2023 · Updated 2 years ago
neelnanda-io / neel-plotly
A very hacky set of functions for getting plotly to do what I want when doing mech interp research, designed to be compatible with PyTorc…
☆15 · Jun 16, 2023 · Updated 3 years ago
dmhyun / MSRP
Official repository of Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization [EMNLP'22 …
☆10 · May 20, 2023 · Updated 3 years ago
DavideBuffelli / SizeShiftReg
Code for the paper "SizeShiftReg: a Regularization Method for Improving Size-Generalization in Graph Neural Networks"
☆12 · Jan 17, 2023 · Updated 3 years ago