Locating and editing factual associations in GPT (NeurIPS 2022)
☆730 · Apr 20, 2024 · Updated last year
Alternatives and similar repositories for rome
Users who are interested in rome are comparing it to the libraries listed below.
- Mass-editing thousands of facts into a transformer memory (ICLR 2023) ☆541 · Jan 31, 2024 · Updated 2 years ago
- [ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs. ☆2,723 · Feb 9, 2026 · Updated 3 weeks ago
- MEND: Fast Model Editing at Scale ☆257 · Aug 30, 2023 · Updated 2 years ago
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆61 · May 9, 2023 · Updated 2 years ago
- Code for the ACL-2022 paper "Knowledge Neurons in Pretrained Transformers" ☆174 · May 4, 2024 · Updated last year
- Must-read Papers on Knowledge Editing for Large Language Models. ☆1,217 · Jul 12, 2025 · Updated 7 months ago
- 🩹Editing large language models within 10 seconds⚡ ☆1,360 · Aug 13, 2023 · Updated 2 years ago
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆71 · Nov 1, 2022 · Updated 3 years ago
- Code for Editing Factual Knowledge in Language Models ☆142 · Jan 28, 2022 · Updated 4 years ago
- A library for mechanistic interpretability of GPT-style language models ☆3,133 · Updated this week
- ☆273 · Oct 1, 2024 · Updated last year
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… ☆99 · Sep 5, 2021 · Updated 4 years ago
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆119 · Sep 12, 2024 · Updated last year
- Inspecting and Editing Knowledge Representations in Language Models ☆119 · Jul 24, 2023 · Updated 2 years ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆83 · Dec 21, 2024 · Updated last year
- The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models. ☆184 · May 13, 2022 · Updated 3 years ago
- A library for finding knowledge neurons in pretrained transformer models. ☆159 · Feb 13, 2022 · Updated 4 years ago
- Stanford NLP Python library for understanding and improving PyTorch models via interventions ☆863 · Jan 29, 2026 · Updated last month
- Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models" ☆541 · Jan 17, 2025 · Updated last year
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆825 · Feb 23, 2026 · Updated last week
- Inference-Time Intervention: Eliciting Truthful Answers from a Language Model ☆572 · Jan 28, 2025 · Updated last year
- Training Sparse Autoencoders on Language Models ☆1,233 · Updated this week
- Tools for understanding how transformer predictions are built layer-by-layer ☆567 · Aug 7, 2025 · Updated 6 months ago
- ☆284 · Mar 2, 2024 · Updated 2 years ago
- ☆41 · Nov 30, 2023 · Updated 2 years ago
- ☆1,072 · Mar 6, 2024 · Updated last year
- Evaluating the Ripple Effects of Knowledge Editing in Language Models ☆56 · Apr 15, 2024 · Updated last year
- ☆68 · May 18, 2023 · Updated 2 years ago
- How do transformer LMs encode relations? ☆56 · Feb 24, 2024 · Updated 2 years ago
- Mechanistic Interpretability Visualizations using React ☆326 · Dec 18, 2024 · Updated last year
- Erasing concepts from neural representations with provable guarantees ☆243 · Jan 27, 2025 · Updated last year
- ☆29 · Apr 30, 2024 · Updated last year
- The hub for EleutherAI's work on interpretability and learning dynamics ☆2,739 · Nov 15, 2025 · Updated 3 months ago
- Steering Llama 2 with Contrastive Activation Addition ☆212 · May 23, 2024 · Updated last year
- ☆52 · Oct 23, 2023 · Updated 2 years ago
- ☆571 · Jul 19, 2024 · Updated last year
- ☆209 · Oct 14, 2025 · Updated 4 months ago
- Representation Engineering: A Top-Down Approach to AI Transparency ☆953 · Aug 14, 2024 · Updated last year
- 👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark" ☆20 · Jan 19, 2024 · Updated 2 years ago