Prisma-Multimodal / ViT-PrismaLinks

ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs).

☆315

Alternatives and similar repositories for ViT-Prisma

Users that are interested in ViT-Prisma are comparing it to the libraries listed below

Sorting:

saprmarks / dictionary_learning
☆354Updated 2 months ago
EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆640Updated last week
ai-safety-foundation / sparse_autoencoder
Sparse Autoencoder for Mechanistic Interpretability
☆272Updated last year
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆222Updated 10 months ago
ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆683Updated last week
openai / sparse_autoencoder
☆530Updated last year
ArthurConmy / Automatic-Circuit-Discovery
☆247Updated last year
TransformerLensOrg / CircuitsVis
Mechanistic Interpretability Visualizations using React
☆293Updated 10 months ago
neelnanda-io / 1L-Sparse-Autoencoder
☆128Updated last year
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆218Updated last week
adamkarvonen / SAEBench
☆131Updated this week
HoagyC / sparse_coding
Using sparse coding to find distributed representations used by neural networks.
☆279Updated last year
jbloomAus / SAELens
Training Sparse Autoencoders on Language Models
☆1,001Updated this week
rbalestr-lab / stable-pretraining
Reliable, minimal and scalable library for pretraining foundation and world models
☆71Updated this week
KihoPark / linear_rep_geometry
☆107Updated 8 months ago
minyoungg / platonic-rep
☆615Updated 6 months ago
AlignmentResearch / tuned-lens
Tools for understanding how transformer predictions are built layer-by-layer
☆532Updated 2 months ago
saprmarks / feature-circuits
☆190Updated last week
apartresearch / interpretability-starter
🧠 Starter templates for doing interpretability research
☆75Updated 2 years ago
neelnanda-io / Crosscoders
☆54Updated 11 months ago
collin-burns / discovering_latent_knowledge
☆279Updated last year
ruizheliUOA / Awesome-Interpretability-in-Large-Language-Models
This repository collects all relevant resources about interpretability in LLMs
☆375Updated 11 months ago
KempnerInstitute / overcomplete
👋 Overcomplete is a Vision-based SAE Toolbox
☆93Updated 2 months ago
jacobdunefsky / transcoder_circuits
☆179Updated 11 months ago
mlfoundations / task_vectors
Editing Models with Task Arithmetic
☆508Updated last year
anthropics / toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆129Updated 3 years ago
bartbussmann / matryoshka_sae
☆48Updated 9 months ago
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆55Updated 5 months ago
steering-vectors / steering-vectors
Steering vectors for transformer language models in Pytorch / Huggingface
☆125Updated 8 months ago
ARBORproject / arborproject.github.io
☆81Updated 7 months ago