Collection of Reverse Engineering in Large Model
☆36Jan 8, 2025Updated last year
Alternatives and similar repositories for Awesome-Sparse-Autoencoder
Users that are interested in Awesome-Sparse-Autoencoder are comparing it to the libraries listed below.
- [ICLR 2024] Unveiling the Pitfalls of Knowledge Editing for Large Language Models☆22Jun 13, 2024Updated last year
- Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper.☆18Nov 21, 2025Updated 6 months ago
- [NLPCC 2022] Kformer: Knowledge Injection in Transformer Feed-Forward Layers☆39Oct 20, 2022Updated 3 years ago
- Materials for the paper https://arxiv.org/pdf/2007.15036.pdf☆14Aug 3, 2020Updated 5 years ago
- How do transformer LMs encode relations?☆57Feb 24, 2024Updated 2 years ago
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc.☆306Jan 22, 2026Updated 4 months ago
- Using sparse coding to find distributed representations used by neural networks.☆304Nov 10, 2023Updated 2 years ago
- ☆103May 11, 2026Updated last week
- ☆28Feb 27, 2025Updated last year
- The Github repo for our survey paper: "Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large…☆131Apr 15, 2026Updated last month
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality☆239Aug 2, 2024Updated last year
- FeatureAlignment = Alignment + Mechanistic Interpretability☆35Mar 8, 2025Updated last year
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …☆256Updated this week
- Reproduction Code for Paper "Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models"☆14Jun 1, 2024Updated last year
- ☆76Mar 6, 2025Updated last year
- Sparsify transformers with cross-layer transcoders☆23Nov 14, 2025Updated 6 months ago
- ☆247Nov 22, 2024Updated last year
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models☆52Nov 17, 2024Updated last year
- [EMNLP 2025] Circuit-Aware Editing Enables Generalizable Knowledge Learners☆19Nov 17, 2025Updated 6 months ago
- Stable Prediction with Model Misspecification and Agnostic Distribution Shift☆26Apr 26, 2020Updated 6 years ago
- Code for the paper "Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages" (N…☆17Apr 13, 2025Updated last year
- Code and dataset for the paper: "Can Editing LLMs Inject Harm?" [AAAI'26]☆21Dec 26, 2025Updated 4 months ago
- Code for the paper "A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis"☆20Jun 12, 2025Updated 11 months ago
- Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits☆41Jan 8, 2026Updated 4 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering☆81Jan 16, 2026Updated 4 months ago
- [EMNLP 2023] Knowledge Rumination for Pre-trained Language Models☆17Jun 29, 2023Updated 2 years ago
- The code for paper Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models.☆13Apr 10, 2024Updated 2 years ago
- A curated list of Large Language Model (LLM) Interpretability resources.☆1,496Feb 24, 2026Updated 2 months ago
- Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668.☆12Jun 19, 2024Updated last year
- Code for "Automatic Circuit Finding and Faithfulness"☆17Jul 11, 2024Updated last year
- My take on how you should organize your transformer experiments.☆13Apr 13, 2022Updated 4 years ago
- Sparsify transformers with SAEs and transcoders☆720Apr 27, 2026Updated 3 weeks ago
- Exploring Model Kinship for Merging Large Language Models☆28Apr 16, 2025Updated last year
- Repository of GUI Action Narrator☆13Apr 8, 2025Updated last year
- An unofficial Wiki for UM-SJTU JI Dual-Degree Program.☆17Mar 27, 2023Updated 3 years ago
- ☆15Sep 27, 2023Updated 2 years ago
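- A powerful survey questionnaire system with many question types to choose from, drag-and-drop form creation, online preview, report queries, and more. Supports skip and display logic between questions. The community edition is live, and some features are still being updated. Stay tuned!☆10Oct 25, 2024Updated last year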
- ☆84Aug 22, 2024Updated last year
- ☆416Aug 21, 2025Updated 9 months ago