PaulPauls / llama3_interpretability_sae

A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.

☆620

Alternatives and similar repositories for llama3_interpretability_sae

Users who are interested in llama3_interpretability_sae are comparing it to the libraries listed below.

valine / NeuralFlow
Visualize the intermediate output of Mistral 7B
☆367 Updated 6 months ago
labmlai / inspectus
LLM Analytics
☆674 Updated 9 months ago
rentruewang / bocoel
Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l…
☆286 Updated last month
vgel / repeng
A library for making RepE control vectors
☆618 Updated 6 months ago
felafax / felafax
Felafax is building AI infra for non-NVIDIA GPUs
☆567 Updated 6 months ago
google-deepmind / recurrentgemma
Open weights language model from Google DeepMind, based on Griffin.
☆645 Updated last month
facebookresearch / searchformer
Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".
☆372 Updated last year
mlecauchois / micrograd-cuda
☆249 Updated last year
Tsadoq / ErisForge
Dead Simple LLM Abliteration
☆228 Updated 5 months ago
lechmazur / elimination_game
A multi-player tournament benchmark that tests LLMs in social reasoning, strategy, and deception. Players engage in public and private co…
☆283 Updated 2 weeks ago
adamkarvonen / chess_llm_interpretability
Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and …
☆208 Updated 8 months ago
joennlae / tensorli
Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines).
☆253 Updated last year
therealoliver / Deepdive-llama3-from-scratch
Implement llama3 inference step by step: grasp the core concepts, master the derivation of the process, and write the code.
☆601 Updated 5 months ago
ScalingIntelligence / tokasaurus
☆383 Updated 2 weeks ago
em-llm / EM-LLM-model
☆220 Updated 4 months ago
okuvshynov / slowllama
Finetune llama2-70b and codellama on MacBook Air without quantization
☆448 Updated last year
google-deepmind / treescope
An interactive HTML pretty-printer for machine learning research in IPython notebooks.
☆426 Updated 3 months ago
Cerebras / gigaGPT
a small code base for training large models
☆307 Updated 3 months ago
enjalot / latent-scope
A scientific instrument for investigating latent spaces
☆717 Updated 2 months ago
neurallambda / awesome-reasoning
a curated list of data for reasoning ai
☆137 Updated 11 months ago
idoh / mamba.np
A pure NumPy implementation of Mamba.
☆223 Updated last year
FlorianDietz / comgra
A tool to analyze and debug neural networks in PyTorch. Use a GUI to traverse the computation graph and view the data from many different…
☆288 Updated 7 months ago
EGjoni / DRUGS
Stop messing around with finicky sampling parameters and just use DRµGS!
☆350 Updated last year
andyk / recursive_llm
Implement recursion using English as the programming language and an LLM as the runtime.
☆239 Updated 2 years ago
valine / training-hot-swap
PyTorch script hot swap: change code without unloading your LLM from VRAM
☆126 Updated 3 months ago
kolinko / effort
An implementation of bucketMul LLM inference
☆221 Updated last year
SakanaAI / evo-memory
Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.
☆316 Updated 9 months ago
imelnyk / ArxivPapers
Code behind Arxiv Papers
☆526 Updated last year
google-deepmind / penzai
A JAX research toolkit for building, editing, and visualizing neural networks.
☆1,805 Updated last month
bananaml / fructose
☆745 Updated last year