Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆153 · Sep 14, 2022 · Updated 3 years ago
Alternatives and similar repositories for toy-models-of-superposition
Users who are interested in toy-models-of-superposition are comparing it to the libraries listed below.
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons ☆14 · Feb 13, 2023 · Updated 3 years ago
- Hypercorn is an ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn. ☆19 · Jan 12, 2026 · Updated 5 months ago
- ☆27 · Sep 5, 2024 · Updated last year
- ☆43 · Feb 11, 2025 · Updated last year
- ☆30 · May 4, 2023 · Updated 3 years ago
- Mechanistic Interpretability Visualizations using React ☆352 · Apr 30, 2026 · Updated last month
- James' cookbook of evaluations and finetuning experiments ☆28 · Feb 19, 2026 · Updated 4 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆67 · Oct 27, 2024 · Updated last year
- ☆67 · Feb 16, 2023 · Updated 3 years ago
- Using sparse coding to find distributed representations used by neural networks. ☆306 · Nov 10, 2023 · Updated 2 years ago
- Applying SAEs for fine-grained control ☆27 · Dec 15, 2024 · Updated last year
- ☆12 · Oct 23, 2022 · Updated 3 years ago
- A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations ☆223 · Dec 22, 2021 · Updated 4 years ago
- ☆136 · Oct 28, 2023 · Updated 2 years ago
- ☆22 · Feb 13, 2026 · Updated 4 months ago
- ☆26 · Jun 22, 2025 · Updated 11 months ago
- Code associated to papers on superposition (in ML interpretability) ☆40 · Sep 13, 2022 · Updated 3 years ago
- Mamba support for transformer lens ☆20 · Sep 17, 2024 · Updated last year
- ☆109 · Aug 8, 2024 · Updated last year
- Engine for collecting, uploading, and downloading model activations ☆29 · Apr 2, 2025 · Updated last year
- ☆422 · Aug 21, 2025 · Updated 9 months ago
- ☆27 · Oct 6, 2024 · Updated last year
- A library for mechanistic interpretability of GPT-style language models ☆3,553 · Jun 11, 2026 · Updated last week
- ☆59 · Nov 19, 2024 · Updated last year
- Training Sparse Autoencoders on Language Models ☆1,419 · Jun 9, 2026 · Updated last week
- NeurIPS22 "RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection" and T-PAMI Extension ☆20 · Feb 21, 2025 · Updated last year
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024) ☆12 · Oct 31, 2024 · Updated last year
- ☆1,082 · Mar 6, 2024 · Updated 2 years ago
- ☆20 · Nov 15, 2024 · Updated last year
- ☆289 · Oct 1, 2024 · Updated last year
- LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces ☆103 · Sep 21, 2023 · Updated 2 years ago
- Sparse probing paper full code. ☆68 · Dec 17, 2023 · Updated 2 years ago
- Improving Steering Vectors by Targeting Sparse Autoencoder Features ☆27 · Nov 20, 2024 · Updated last year
- ☆12 · Mar 7, 2024 · Updated 2 years ago
- Sparse Autoencoder for Mechanistic Interpretability ☆302 · Jul 20, 2024 · Updated last year
- ☆396 · Jul 2, 2024 · Updated last year
- A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations ☆17 · Apr 15, 2024 · Updated 2 years ago
- Sparse Autoencoder Training Library ☆57 · May 1, 2025 · Updated last year
- ☆13 · Aug 19, 2024 · Updated last year