neelnanda-io / Neuroscope

Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons

☆12

Alternatives and similar repositories for Neuroscope

Users who are interested in Neuroscope are comparing it to the repositories listed below

hijohnnylin / neuronpedia-scorer
☆17 Updated last year
neelnanda-io / Grokking
A Mechanistic Interpretability Analysis of Grokking
☆21 Updated 2 years ago
anthropics / sycophancy-to-subterfuge-paper
☆23 Updated 10 months ago
bilal-chughtai / rep-theory-mech-interp
☆26 Updated 2 years ago
gpoesia / certified-reasoning
Certified Reasoning with Language Models
☆31 Updated last year
anthropics / toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆127 Updated 2 years ago
oughtinc / primer
Factored Cognition Primer: How to write compositional language model programs
☆49 Updated 2 years ago
nickkeesG / Pantheon
Experimental LLM interface exploring new ways to use AI to improve human thinking
☆18 Updated 4 months ago
Mech-Interp / PySvelte
A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations
☆14 Updated last year
JasonGross / guarantees-based-mechanistic-interpretability
☆14 Updated 2 weeks ago
Chillee / lit-llama
Simple (fast) transformer inference in PyTorch with torch.compile + lit-llama code
☆11 Updated last year
EleutherAI / elk-generalization
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e…
☆28 Updated last year
google-deepmind / mishax
☆134 Updated 3 months ago
R0bk / Transpector
Visual Transformer Mechanistic Analysis Tool
☆34 Updated 2 years ago
google-deepmind / dangerous-capability-evaluations
☆55 Updated 9 months ago
HumanCompatibleAI / leela-interp
Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"
☆24 Updated last year
moirage / alignment-research-dataset
A dataset of alignment research and code to reproduce it
☆77 Updated 2 years ago
ndif-team / ndif
The NDIF server, which performs deep inference and serves nnsight requests remotely
☆32 Updated this week
adamimos / epsilon-transformers
epsilon machines and transformers!
☆25 Updated last week
noanabeshima / tinymodel
A TinyStories LM with SAEs and transcoders
☆12 Updated 3 months ago
neoneye / ARC-Interactive
Enjoy puzzle-solving directly in your browser.
☆28 Updated 2 months ago
amack315 / unsupervised-steering-vectors
☆32 Updated last year
KihoPark / LLM_Categorical_Hierarchical_Representations
☆101 Updated 5 months ago
jbloomAus / SAEDashboard
☆60 Updated this week
JD-P / RetroInstruct
Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search.
☆32 Updated 4 months ago
METR / task-template
☆9 Updated 11 months ago
hijohnnylin / automated-interpretability
☆10 Updated last month
callummcdougall / sae_visualizer
☆28 Updated last year
harmonic-ai / datasets
Harmonic Datasets
☆40 Updated last year
METR / vivaria
Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
☆99 Updated 2 weeks ago