hkust-nlp/Activation_Decoding

In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)

☆63

Alternatives and similar repositories for Activation_Decoding

Users that are interested in Activation_Decoding are comparing it to the libraries listed below.

ictnlp / TACS
Source code for Truth-Aware Context Selection: Mitigating the Hallucinations of Large Language Models Being Misled by Untruthful Contexts
☆17 · Sep 2, 2024 · Updated last year
HillZhang1999 / ICD
Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations"
☆71 · Feb 27, 2024 · Updated 2 years ago
voidism / DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
☆556 · Jul 12, 2026 · Updated last week
aryopg / decore
Official Implementation of "DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucination"
☆30 · Dec 18, 2024 · Updated last year
hkust-nlp / felm
Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)
☆65 · Dec 25, 2023 · Updated 2 years ago
fanqiwan / KCA
EMNLP'2024: Knowledge Verification to Nip Hallucination in the Bud
☆23 · Mar 10, 2024 · Updated 2 years ago
yuezih / less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
☆58 · Oct 28, 2024 · Updated last year
koalazf99 / nanoverl
Collections of RLxLM experiments using minimal codes
☆14 · Feb 17, 2025 · Updated last year
mandyyyyii / east
☆19 · Aug 4, 2025 · Updated 11 months ago
GAIR-NLP / alignment-for-honesty
☆78 · May 22, 2024 · Updated 2 years ago
GAIR-NLP / Safety-J
Safety-J: Evaluating Safety with Critique
☆16 · Jul 28, 2024 · Updated last year
yifeiwang77 / Self-Correction
☆20 · Nov 3, 2024 · Updated last year
GAIR-NLP / self-improvement-reversal
☆13 · Jul 14, 2024 · Updated 2 years ago
tmlr-group / NoisyRationales
[NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"
☆40 · Jul 18, 2025 · Updated last year
RUCAIBox / HaluEval
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.
☆592 · Feb 12, 2024 · Updated 2 years ago
PAIR-code / interpretability
PAIR.withgoogle.com and friend's work on interpretability methods
☆234 · Jun 22, 2026 · Updated 3 weeks ago
MiaoXiong2320 / llm-uncertainty
code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs"
☆148 · Mar 14, 2024 · Updated 2 years ago
tatsu-lab / linguistic_calibration
Align your LM to express calibrated verbal statements of confidence in its long-form generations.
☆29 · Jun 4, 2024 · Updated 2 years ago
likenneth / honest_llama
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
☆581 · Jan 28, 2025 · Updated last year
XiangLi1999 / ContrastiveDecoding
contrastive decoding
☆206 · Nov 14, 2022 · Updated 3 years ago
GAIR-NLP / OPO
☆50 · Mar 2, 2024 · Updated 2 years ago
rhyang2021 / ARIA
Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".
☆30 · Aug 9, 2025 · Updated 11 months ago
DiLi-Lab / ScanDL
☆14 · Apr 29, 2025 · Updated last year
hkust-nlp / model-task-align-rl
[ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".
☆18 · Feb 9, 2026 · Updated 5 months ago
YiyangZhou / LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆158 · Apr 30, 2024 · Updated 2 years ago
deeplearning-wisc / haloscope
source code for NeurIPS'24 paper "HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection"
☆70 · Apr 11, 2025 · Updated last year
GAIR-NLP / MetaCritique
Evaluate the Quality of Critique
☆37 · Jun 1, 2024 · Updated 2 years ago
GAIR-NLP / Entropy-ABF
Official implementation for 'Extending LLMs’ Context Window with 100 Samples'
☆82 · Jan 18, 2024 · Updated 2 years ago
mrwu-mac / R-Bench
[ICML2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models"
☆24 · Jan 1, 2025 · Updated last year
yegcjs / DINOISER
☆26 · Jul 15, 2025 · Updated last year
Model-GLUE / Model-GLUE
☆18 · Aug 19, 2024 · Updated last year
GraySwanAI / circuit-breakers
Improving Alignment and Robustness with Circuit Breakers
☆266 · Sep 24, 2024 · Updated last year
ictnlp / TruthX
Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space"
☆144 · Mar 26, 2024 · Updated 2 years ago
vfleaking / PTST
Code for safety test in "Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates"
☆22 · Sep 21, 2025 · Updated 10 months ago
yhcc / utcie
This is the code repo for the paper <UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction>
☆15 · Aug 10, 2023 · Updated 2 years ago
LzVv123456 / I2CL
☆41 · May 24, 2024 · Updated 2 years ago
explanare / ravel
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
☆58 · Oct 30, 2025 · Updated 8 months ago
mt-upc / logit-explanations
☆18 · Jun 19, 2023 · Updated 3 years ago
EleutherAI / semantic-memorization
☆44 · Nov 17, 2024 · Updated last year