slavachalnev / SAE-TS
Improving Steering Vectors by Targeting Sparse Autoencoder Features
☆18 · Updated 5 months ago
Alternatives and similar repositories for SAE-TS:
Users who are interested in SAE-TS are comparing it to the repositories listed below:
- Open source replication of Anthropic's Crosscoders for Model Diffing · ☆54 · Updated 6 months ago
- ☆39 · Updated 5 months ago
- A library for efficient patching and automatic circuit discovery. · ☆64 · Updated 2 weeks ago
- ☆93 · Updated 3 weeks ago
- Applying SAEs for fine-grained control · ☆17 · Updated 4 months ago
- ☆10 · Updated 9 months ago
- Sparse Autoencoder Training Library · ☆49 · Updated this week