science-of-finetuning / crosscoder_learning
Modified to support crosscoder training.
☆25 Feb 4, 2026 Updated last week
Alternatives and similar repositories for crosscoder_learning
Users who are interested in crosscoder_learning are comparing it to the libraries listed below.
- Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper. ☆16 Nov 21, 2025 Updated 2 months ago
- ☆16 Jul 9, 2025 Updated 7 months ago
- A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively. ☆57 Updated this week
- Applying SAEs for fine-grained control ☆25 Dec 15, 2024 Updated last year
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆241 Updated this week
- Trains Sparse Autoencoders based on outputs from language models ☆11 Oct 7, 2024 Updated last year
- ☆58 Nov 19, 2024 Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 Mar 31, 2025 Updated 10 months ago
- ☆25 Nov 28, 2024 Updated last year
- ACRE: Abstract Causal REasoning Beyond Covariation ☆19 Dec 7, 2021 Updated 4 years ago
- Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs) ☆60 Jul 24, 2025 Updated 6 months ago
- ☆20 Apr 10, 2025 Updated 10 months ago
- Unified access to Large Language Model modules using NNsight ☆88 Feb 6, 2026 Updated last week
- ☆23 Aug 23, 2025 Updated 5 months ago
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆811 Updated this week
- Sparse Autoencoder Training Library ☆56 May 1, 2025 Updated 9 months ago
- ☆30 Dec 2, 2024 Updated last year
- ☆47 May 27, 2025 Updated 8 months ago
- ☆132 Oct 28, 2023 Updated 2 years ago
- ☆83 Feb 25, 2025 Updated 11 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆83 Nov 27, 2024 Updated last year
- ☆71 Updated this week
- Open source interpretability artefacts for R1. ☆170 Apr 21, 2025 Updated 9 months ago
- Code for experiments on self-prediction as a way to measure introspection in LLMs ☆16 Dec 10, 2024 Updated last year
- Sample implementation accompanying the NeurIPS 2019 paper 'Powerset Convolutional Neural Networks' by Chris Wendler, Dan Alistarh, and Ma… ☆10 Oct 26, 2020 Updated 5 years ago
- [ICLR 2024 Spotlight] Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Communi… ☆11 Mar 29, 2024 Updated last year
- Residual Quantization Autoencoder, used for interpreting LLMs ☆14 Jan 1, 2025 Updated last year
- ☆10 Mar 9, 2025 Updated 11 months ago
- Create string diagrams with LaTeX! ☆14 Jan 3, 2025 Updated last year
- Training Sparse Autoencoders on Language Models ☆1,201 Updated this week
- All source code and materials for the AMIA 2015 Tutorial on Using R for Healthcare Data Science co-taught by Vojtech Huser and Laura Wile… ☆12 Nov 15, 2018 Updated 7 years ago
- Menagerie of video models trained on various video datasets ☆10 Oct 13, 2024 Updated last year
- The Compositionality article class. ☆13 Jun 12, 2025 Updated 8 months ago
- Code for the paper "Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages" (N… ☆16 Apr 13, 2025 Updated 10 months ago
- Implementations of several self-supervised pretext tasks for language and vision modalities in PyTorch. ☆13 Jan 19, 2021 Updated 5 years ago
- ACL 2023 *oral* paper "MGR: Multi-generator based Rationalization" ☆10 Nov 21, 2024 Updated last year
- Kakao Mobility MCP Server for directions and transit information ☆10 Sep 14, 2025 Updated 5 months ago
- Code for "Multi-scale Abstract Reasoning" paper ☆10 Oct 17, 2022 Updated 3 years ago
- Code for "SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields" (ECCV 2024) ☆12 Oct 30, 2024 Updated last year