ictnlp / TruthXLinks

Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space"

☆147

Alternatives and similar repositories for TruthX

Users that are interested in TruthX are comparing it to the libraries listed below

Sorting:

KbsdJames / Awesome-LLM-Preference-Learning
The official repository of our survey paper: "Towards a Unified View of Preference Learning for Large Language Models: A Survey"
☆184Updated 9 months ago
songmzhang / DSKD
Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same…
☆56Updated 9 months ago
getao / icae
The repo for In-context Autoencoder
☆133Updated last year
RUCAIBox / Language-Specific-Neurons
☆79Updated 7 months ago
IAAR-Shanghai / Awesome-Attention-Heads
An awesome repository & A comprehensive survey on interpretability of LLM attention heads.
☆355Updated 5 months ago
HillZhang1999 / ICD
Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations"
☆68Updated last year
zhaochen0110 / conflictbank
Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and…
☆46Updated 2 months ago
lancopku / label-words-are-anchors
Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
☆165Updated last year
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆61Updated last year
hahahawu / Long-to-Short-via-Model-Merging
Model merging is a highly efficient approach for long-to-short reasoning.
☆77Updated 2 months ago
zepingyu0512 / neuron-attribution
code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models
☆39Updated 8 months ago
OceannTwT / ra-isf
[ACL 2024] RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback.
☆191Updated 11 months ago
Glaciohound / LM-Steer
Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award)
☆123Updated 3 weeks ago
circle-hit / SAPT
Code for ACL 2024 accepted paper titled "SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language …
☆35Updated 6 months ago
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆26Updated 10 months ago
pillowsofwind / Knowledge-Conflicts-Survey
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
☆130Updated 10 months ago
MikaStars39 / FeatureAlignment
FeatureAlignment = Alignment + Mechanistic Interpretability
☆29Updated 5 months ago
zwhe99 / MAPS-mt
[TACL 2024] MAPS enables LLMs🤖 to mimic the human😁 translation process.
☆144Updated last year
wang2226 / Awesome-LLM-Decoding
📜 Paper list on decoding methods for LLMs and LVLMs
☆55Updated last month
zhenyu-02 / LogitLens4LLMs
A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enab…
☆95Updated 5 months ago
RUCAIBox / HaluEval-2.0
☆46Updated last year
cooperleong00 / Awesome-LLM-Interpretability
A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc..
☆262Updated 4 months ago
LuckyyySTA / Awesome-LLM-hallucination
LLM hallucination paper list
☆321Updated last year
edenbiran / RippleEdits
Evaluating the Ripple Effects of Knowledge Editing in Language Models
☆56Updated last year
deeplearning-wisc / picle
Official code for ICML 2024 paper on Persona In-Context Learning (PICLe)
☆25Updated last year
alisawuffles / proxy-tuning
Code associated with Tuning Language Models by Proxy (Liu et al., 2024)
☆114Updated last year
October2001 / ProLong
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
☆56Updated last year
Jihuai-wpy / InferAligner
☆34Updated 10 months ago
jinzhuoran / RWKU
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024
☆77Updated 10 months ago
mlwu22 / RED
Implementation code for ACL2024：Advancing Parameter Efficiency in Fine-tuning via Representation Editing
☆14Updated last year