nostalgebraist / transformer-utils

Utilities for the HuggingFace transformers library

☆70

Alternatives and similar repositories for transformer-utils

Users who are interested in transformer-utils are comparing it to the libraries listed below.

TomFrederik / unseal
Mechanistic Interpretability for Transformer Models
☆51 Updated 3 years ago
EleutherAI / concept-erasure
Erasing concepts from neural representations with provable guarantees
☆231 Updated 6 months ago
neelnanda-io / 1L-Sparse-Autoencoder
☆124 Updated last year
TransformerLensOrg / CircuitsVis
Mechanistic Interpretability Visualizations using React
☆272 Updated 7 months ago
guy-dar / embedding-space
☆54 Updated 2 years ago
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆73 Updated 2 weeks ago
redwoodresearch / Easy-Transformer
☆121 Updated last year
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆207 Updated 7 months ago
mega002 / lm-debugger
The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models.
☆178 Updated 3 years ago
EleutherAI / elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
☆209 Updated last week
ArthurConmy / Automatic-Circuit-Discovery
☆234 Updated 10 months ago
collin-burns / discovering_latent_knowledge
☆274 Updated last year
EleutherAI / elk-generalization
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e…
☆28 Updated last year
anthropics / toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆127 Updated 2 years ago
KihoPark / LLM_Categorical_Hierarchical_Representations
☆104 Updated 5 months ago
EleutherAI / knowledge-neurons
A library for finding knowledge neurons in pretrained transformer models.
☆158 Updated 3 years ago
krandiash / quinine
A library to create and manage configuration files, especially for machine learning projects.
☆79 Updated 3 years ago
AlignmentResearch / tuned-lens
Tools for understanding how transformer predictions are built layer-by-layer
☆512 Updated last year
PAIR-code / interpretability
PAIR.withgoogle.com and friends' work on interpretability methods
☆195 Updated 2 weeks ago
wesg52 / sparse-probing-paper
Sparse probing paper full code.
☆58 Updated last year
likenneth / othello_world
Emergent world representations: Exploring a sequence model trained on a synthetic task
☆186 Updated 2 years ago
saprmarks / geometry-of-truth
☆89 Updated 11 months ago
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆200 Updated this week
ckkissane / crosscoder-model-diff-replication
Open source replication of Anthropic's Crosscoders for Model Diffing
☆57 Updated 9 months ago
princeton-nlp / TransformerPrograms
[NeurIPS 2023] Learning Transformer Programs
☆162 Updated last year
mega002 / ff-layers
The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le…
☆94 Updated 3 years ago
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆54 Updated 3 months ago
justinchiu / openlogprobs
Extract full next-token probabilities via language model APIs
☆247 Updated last year
adamkarvonen / SAEBench
☆109 Updated 3 weeks ago
mishajw / repeng
Experiments with representation engineering
☆12 Updated last year