A curated reading list of research in Sparse Autoencoders, Feature Extraction and related topics in Mechanistic Interpretability
☆32Jan 30, 2025Updated last year
Alternatives and similar repositories for awesome-sparse-autoencoders
Users that are interested in awesome-sparse-autoencoders are comparing it to the libraries listed below.
- ☆416Aug 21, 2025Updated 9 months ago
- PyTorch and NNsight implementation of AtP* (Kramár et al., 2024, DeepMind)☆20Jan 19, 2025Updated last year
- Improving Steering Vectors by Targeting Sparse Autoencoder Features☆27Nov 20, 2024Updated last year
- ☆19Mar 5, 2024Updated 2 years ago
- REBUS: A Robust Evaluation Benchmark of Understanding Symbols☆13Aug 13, 2024Updated last year
- ☆12Jul 8, 2024Updated last year
- The official code for the paper "What Constitutes a Faithful Summary? Preserving Author Perspectives in News Summarization"☆10Jun 23, 2024Updated last year
- ☆38Feb 18, 2026Updated 3 months ago
- A decompressor/converter for .ANI/.SLD animation files, as used in the Brunswick Frameworx bowling scoring system.☆16Feb 9, 2022Updated 4 years ago
- Materials for "Multi-property Steering of Large Language Models with Dynamic Activation Composition"☆14Nov 22, 2024Updated last year
- Variational Auto-Encoder implementation in Tensorflow☆10Jan 22, 2017Updated 9 years ago
- Official Release of NeurIPS 2024 paper "Slot State Space Models"☆11Mar 22, 2025Updated last year
- A benchmark for mechanistic discovery of circuits in Transformers☆17Dec 15, 2024Updated last year
- A collection of lightweight interpretability scripts to understand how LLMs think☆89Mar 18, 2026Updated 2 months ago
- Attribution-based Parameter Decomposition☆34Jun 11, 2025Updated 11 months ago
- Reproduction Code for Paper "Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models"☆14Jun 1, 2024Updated last year
- ICLR 2025 Workshop & CHI 2025 SIG: "Bidirectional Human-AI Alignment"☆55Aug 6, 2024Updated last year
- Sparse Autoencoder Training Library☆57May 1, 2025Updated last year
- ☆28Oct 30, 2025Updated 7 months ago
- ☆22Sep 16, 2025Updated 8 months ago
- ☆22Apr 22, 2024Updated 2 years ago
- Every Eval Ever is a shared schema and crowdsourced eval database. It defines a standardized metadata format for storing AI evaluation re…☆67May 18, 2026Updated last week
- PyTorch project accompanying the paper "Comparing Deep Models and Evaluation Strategies for Multi-Pitch Estimation in Music Recordings", …☆13Aug 26, 2022Updated 3 years ago
- ☆27Jun 29, 2025Updated 11 months ago
- James' cookbook of evaluations and finetuning experiments☆27Feb 19, 2026Updated 3 months ago
- Course webpage for MAT335 at the University of Toronto☆14Apr 3, 2020Updated 6 years ago
- ☆30Sep 19, 2025Updated 8 months ago
- Supporting code for "Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration".☆13Jun 18, 2022Updated 3 years ago
- SimX-OR: Extending Any Simulation Benchmark to Evaluate the Observational Robustness of VLA Models☆33Nov 4, 2025Updated 6 months ago
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).☆259Feb 27, 2026Updated 3 months ago
- This is the official implementation for the paper "On Powerful Ways to Generate: Autoregression, Diffusion, and Beyond".☆22Nov 17, 2025Updated 6 months ago
- [EMNLP 2023] Knowledge Rumination for Pre-trained Language Models☆17Jun 29, 2023Updated 2 years ago
- The code for the paper "Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models".☆13Apr 10, 2024Updated 2 years ago
- A digital garden of my notes and thoughts around all sorts of areas, but primarily technology☆17Updated this week
- Redwood Research's transformer interpretability tools☆15Apr 15, 2022Updated 4 years ago
- Code for experiments on transformers using Markovian data.☆22Nov 22, 2024Updated last year
- ☆13Apr 10, 2025Updated last year
- Test-time adaptation via Nearest neighbor information (TAST), submitted to ICLR'23☆24Jul 11, 2023Updated 2 years ago
- Python code used to analyze and process symbolic drum patterns☆14May 8, 2023Updated 3 years ago