evandez/REMEDI

Inspecting and Editing Knowledge Representations in Language Models

☆120

Alternatives and similar repositories for REMEDI

Users interested in REMEDI are comparing it to the libraries listed below.

evandez / relations
How do transformer LMs encode relations?
☆60 · Feb 24, 2024 · Updated 2 years ago
ajyl / dpo_toxic
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.
☆90 · Mar 7, 2025 · Updated last year
kmeng01 / rome
Locating and editing factual associations in GPT (NeurIPS 2022)
☆770 · Apr 20, 2024 · Updated 2 years ago
kmeng01 / memit
Mass-editing thousands of facts into a transformer memory (ICLR 2023)
☆556 · Jan 31, 2024 · Updated 2 years ago
chicosirius / think-or-not
☆22 · May 23, 2025 · Updated last year
montemac / activation_additions
Algebraic value editing in pretrained language models
☆71 · Nov 1, 2023 · Updated 2 years ago
google / belief-localization
This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca…
☆62 · May 9, 2023 · Updated 3 years ago
mjy1111 / BAKE
This is the repository for our paper: Untying the Reversal Curse via Bidirectional Language Model Editing
☆11 · May 25, 2025 · Updated last year
davidbau / baukit
☆257 · Feb 22, 2024 · Updated 2 years ago
eric-mitchell / mend
MEND: Fast Model Editing at Scale
☆259 · Aug 30, 2023 · Updated 2 years ago
stanfordnlp / pyvene
Stanford NLP Python library for understanding and improving PyTorch models via interventions
☆893 · Mar 6, 2026 · Updated 4 months ago
s-ball-10 / jailbreak_dynamics
☆25 · Jun 13, 2024 · Updated 2 years ago
HLTCHKUST / KnowExpert
The implementation of the paper "Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters".
☆17 · May 24, 2022 · Updated 4 years ago
ericwtodd / function_vectors
Function Vectors in Large Language Models (ICLR 2024)
☆199 · Apr 30, 2026 · Updated 2 months ago
safety-research / believe-it-or-not
Code and data for editing model beliefs with SDF and other methods, and for evaluating the depth of the implanted beliefs.
☆16 · Oct 23, 2025 · Updated 9 months ago
Zhang-Yihao / Adversarial-Representation-Engineering
Official implementation repository for the paper Towards General Conceptual Model Editing via Adversarial Representation Engineering.
☆20 · Dec 6, 2024 · Updated last year
mega002 / lm-debugger
The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models.
☆186 · May 13, 2022 · Updated 4 years ago
zjunlp / FactCHD
[IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection
☆90 · Apr 28, 2024 · Updated 2 years ago
ZeroYuHuang / Transformer-Patcher
☆34 · Aug 5, 2023 · Updated 2 years ago
zjunlp / KnowledgeEditingPapers
Must-read Papers on Knowledge Editing for Large Language Models.
☆1,242 · Jun 25, 2026 · Updated last month
peterbhase / LAS-NL-Explanations
Code for paper "Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?"
☆21 · Oct 13, 2020 · Updated 5 years ago
vr25 / hallucination-foundation-model-survey
A Survey of Hallucination in Large Foundation Models
☆56 · Jan 10, 2024 · Updated 2 years ago
scalable-model-editing / unified-model-editing
We introduce EMMET and unify model editing with popular algorithms ROME and MEMIT.
☆29 · Dec 16, 2024 · Updated last year
likenneth / honest_llama
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
☆581 · Jan 28, 2025 · Updated last year
xpq-tech / PMET
This is a repository for "PMET: Precise Model Editing in a Transformer"
☆58 · Sep 28, 2023 · Updated 2 years ago
feyzaakyurek / dune
Dataset for Unified Editing, EMNLP 2023. This is a model editing dataset where edits are natural language phrases.
☆24 · Sep 4, 2024 · Updated last year
McGill-NLP / AdversarialTriggers
TACL 2025: Investigating Adversarial Trigger Transfer in Large Language Models
☆19 · Aug 17, 2025 · Updated 11 months ago
petezh / OpenD5
Tasks for describing differences between text distributions.
☆17 · Aug 9, 2024 · Updated last year
DanielSc4 / Dynamic-Activation-Composition
Materials for "Multi-property Steering of Large Language Models with Dynamic Activation Composition"
☆14 · Nov 22, 2024 · Updated last year
dreasysnail / CoCon
Consistent dialogue generation
☆16 · Oct 26, 2022 · Updated 3 years ago
milesaturpin / cot-unfaithfulness
☆57 · Oct 23, 2023 · Updated 2 years ago
john-hewitt / control-tasks
Repository describing example random control tasks for designing and interpreting neural probes
☆32 · Jun 21, 2022 · Updated 4 years ago
likenneth / othello_world
Emergent world representations: Exploring a sequence model trained on a synthetic task
☆212 · Jul 12, 2023 · Updated 3 years ago
Hi-ZenanXu / Syntax-Enhanced_Pre-trained_Model
Source Data of ACL2021 paper "Syntax-Enhanced Pre-trained Model"
☆11 · Jun 1, 2021 · Updated 5 years ago
michaelsaxon / CoCoCroLa
The Conceptual Coverage Across Languages Benchmark for Text-to-Image Models
☆12 · Oct 28, 2024 · Updated last year
MaheepChaudhary / SAE-Ravel
Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…
☆13 · Jan 26, 2025 · Updated last year
princeton-nlp / Edge-Pruning
[NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning".
☆70 · Aug 15, 2025 · Updated 11 months ago
Hannibal046 / PlugLM
[ACL2023] Source code for Decouple knowledge from parameters for plug-and-play language modeling
☆20 · Sep 18, 2023 · Updated 2 years ago
loriqing / Label-Reasoning-Network
Code for "Fine-grained Entity Typing via Label Reasoning" EMNLP2021
☆13 · May 27, 2022 · Updated 4 years ago