IParraMartin / Sparse-Autoencoder
A PyTorch implementation of a Sparse Autoencoder (SAE) using MSE loss and a KL divergence penalty
☆27 · Updated last year
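The description above names an objective made of an MSE reconstruction term plus a KL divergence sparsity penalty. A minimal sketch of that kind of loss follows, written in plain Python rather than PyTorch so it runs without dependencies. It is not taken from the repository: the function names, the target sparsity `rho`, the weight `beta`, and the assumption that hidden activations lie in (0, 1) (for example, after a sigmoid) are all illustrative choices.

```python
import math


def kl_bernoulli(rho, rho_hat, eps=1e-8):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat).

    rho is the (assumed) target mean activation, strictly between 0 and 1.
    rho_hat is clamped away from 0 and 1 to keep the logarithms finite.
    """
    rho_hat = min(max(rho_hat, eps), 1.0 - eps)
    return (rho * math.log(rho / rho_hat)
            + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))


def mse(x, x_hat):
    """Mean squared error over every element of two equally shaped batches."""
    n = sum(len(row) for row in x)
    return sum((a - b) ** 2
               for row_x, row_h in zip(x, x_hat)
               for a, b in zip(row_x, row_h)) / n


def sparse_ae_loss(x, x_hat, hidden, rho=0.05, beta=1.0):
    """Reconstruction MSE plus beta times the summed KL sparsity penalty.

    hidden is a batch of hidden-layer activations (one list per sample),
    assumed to lie in (0, 1). rho and beta are hypothetical defaults.
    """
    batch = len(hidden)
    # Mean activation of each hidden unit across the batch.
    mean_act = [sum(col) / batch for col in zip(*hidden)]
    penalty = sum(kl_bernoulli(rho, r) for r in mean_act)
    return mse(x, x_hat) + beta * penalty
```

With a perfect reconstruction and hidden units whose mean activation equals `rho`, both terms vanish; denser activations raise the penalty even when the reconstruction is exact.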
Alternatives and similar repositories for Sparse-Autoencoder
Users who are interested in Sparse-Autoencoder are comparing it to the libraries listed below.
- Sparse Autoencoder for Mechanistic Interpretability ☆290 · Updated last year
- Conference schedule, top papers, and analysis of the data for NeurIPS 2023! ☆120 · Updated 2 years ago
- PyTorch library for Active Fine-Tuning ☆96 · Updated 4 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆102 · Updated 3 months ago
- ⏰ AI conference deadline countdowns ☆320 · Updated last week
- [NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models" 🐍 ☆45 · Updated last year
- 🪄 Interpreto is an interpretability toolbox for LLMs ☆139 · Updated last week
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting… ☆181 · Updated 6 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆34 · Updated 2 years ago
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs). ☆337 · Updated 6 months ago
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications ☆40 · Updated last year
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆238 · Updated last year
- ☆388 · Updated 5 months ago
- Sparse Autoencoder Training Library ☆56 · Updated 9 months ago
- Code repository for Black Mamba ☆261 · Updated last year
- Training small GPT-2 style models using Kolmogorov-Arnold networks. ☆121 · Updated last year
- This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024. ☆106 · Updated last year
- Steering vectors for transformer language models in PyTorch / Hugging Face ☆140 · Updated 11 months ago
- A toolkit for quantitative evaluation of data attribution methods. ☆55 · Updated 6 months ago
- Layer-wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024] ☆219 · Updated 6 months ago
- ☆58 · Updated last year
- ☆58 · Updated last year
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆75 · Updated 7 months ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆205 · Updated 3 weeks ago
- Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs) ☆60 · Updated 6 months ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆213 · Updated last week
- Decomposing and Editing Predictions by Modeling Model Computation ☆139 · Updated last year
- ☆143 · Updated last month
- Attribution-based Parameter Decomposition ☆33 · Updated 7 months ago
- Sparsify transformers with SAEs and transcoders ☆688 · Updated last week