soniajoseph / ViT-Prisma
ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs).
☆204 · Updated this week
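ViT-Prisma exposes the internals of Vision Transformers for interpretability work, in the spirit of hooked-model libraries like TransformerLens. As a rough illustration of the kind of activation-caching workflow such a library supports, here is a minimal sketch using plain PyTorch forward hooks on a torchvision ViT. This is not ViT-Prisma's own API; the model choice, hook targets, and shapes are assumptions for illustration only.

```python
# Illustrative only: caching per-block activations from a ViT with plain
# PyTorch forward hooks. ViT-Prisma wraps this kind of pattern in a
# hooked-model API; this sketch does NOT use ViT-Prisma itself.
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights

model = vit_b_16(weights=ViT_B_16_Weights.DEFAULT).eval()

cache = {}  # block name -> residual-stream activation

def save_hook(name):
    def hook(module, inputs, output):
        # output has shape (batch, tokens, hidden) for each encoder block
        cache[name] = output.detach()
    return hook

# Register a hook on every transformer encoder block.
handles = [
    block.register_forward_hook(save_hook(f"block_{i}"))
    for i, block in enumerate(model.encoder.layers)
]

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)  # stand-in for a real image
    model(image)

for h in handles:
    h.remove()

print(cache["block_0"].shape)  # e.g. torch.Size([1, 197, 768])
```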
Alternatives and similar repositories for ViT-Prisma:
Users interested in ViT-Prisma are comparing it to the libraries listed below.
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆182 · Updated 2 months ago
- ☆243 · Updated last week
- Sparse Autoencoder for Mechanistic Interpretability (a minimal SAE sketch follows this list) ☆216 · Updated 7 months ago
- ☆116 · Updated last year
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆74 · Updated 6 months ago
- ☆203 · Updated 4 months ago
- Using sparse coding to find distributed representations used by neural networks. ☆213 · Updated last year
- ☆86 · Updated last week
- Sparsify transformers with SAEs and transcoders ☆461 · Updated this week
- ☆151 · Updated this week
- Mechanistic Interpretability Visualizations using React ☆232 · Updated 2 months ago
- Training Sparse Autoencoders on Language Models ☆619 · Updated this week
- ☆55 · Updated 3 months ago
- ☆142 · Updated 3 weeks ago
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆490 · Updated this week
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆95 · Updated 3 months ago
- ☆421 · Updated 7 months ago
- Tools for understanding how transformer predictions are built layer-by-layer ☆475 · Updated 8 months ago
- WIP ☆93 · Updated 6 months ago
- ☆52 · Updated this week
- Tools for studying developmental interpretability in neural networks. ☆84 · Updated 3 weeks ago
- Steering vectors for transformer language models in Pytorch / Huggingface ☆88 · Updated this week
- Sparse and discrete interpretability tool for neural networks ☆58 · Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆112 · Updated 2 years ago
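Several of the repositories above train or visualize sparse autoencoders (SAEs). As a rough reference for what that entails, here is a minimal sketch of a standard ReLU SAE with an L1 sparsity penalty; the dimensions, loss weighting, and training step are illustrative assumptions, not any particular repository's defaults.

```python
# Minimal sparse autoencoder sketch (illustrative assumptions throughout):
# a ReLU SAE reconstructs model activations through an overcomplete
# dictionary, with an L1 penalty pushing feature activations toward zero.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 768, d_hidden: int = 768 * 8):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        features = torch.relu(self.encoder(x))  # sparse feature activations
        recon = self.decoder(features)
        return recon, features

sae = SparseAutoencoder()
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
l1_coeff = 1e-3  # sparsity/reconstruction trade-off (assumed value)

# One training step on a batch of cached activations (random stand-in here).
acts = torch.randn(64, 768)
recon, features = sae(acts)
loss = (recon - acts).pow(2).mean() + l1_coeff * features.abs().sum(-1).mean()
loss.backward()
opt.step()
```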