Trustworthy-Information-Access / LLM-Knowledge-Boundary-Perception-via-Internal-States

Official code for the paper "Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception". The code is based on transformers.

☆15

Alternatives and similar repositories for LLM-Knowledge-Boundary-Perception-via-Internal-States

Users who are interested in LLM-Knowledge-Boundary-Perception-via-Internal-States are comparing it to the repositories listed below.

D2I-ai / eigenscore
☆30 Updated 7 months ago
ShiyuNee / Awesome-LMs-Perception-of-Their-Knowledge-Boundaries-Papers
This is a repo consisting of papers about LLMs' perception of their knowledge boundaries
☆13 Updated 2 months ago
AmourWaltz / Reliable-LLM
☆144 Updated 10 months ago
zhenyu-02 / LogitLens4LLMs
A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enab…
☆93 Updated 5 months ago
DaoD / ResearchFigure
Example code for drawing figures in research papers
☆34 Updated 3 years ago
shizhl / Multi-Agent-Papers
The awesome agents in the era of large language models
☆65 Updated last year
GAIR-NLP / Safety-J
Safety-J: Evaluating Safety with Critique
☆16 Updated 11 months ago
zepingyu0512 / awesome-SAE
awesome SAE papers
☆39 Updated last month
mjy1111 / BAKE
This is the repository for our paper: Untying the Reversal Curse via Bidirectional Language Model Editing
☆11 Updated last month
RUCAIBox / Language-Specific-Neurons
☆75 Updated 6 months ago
PolarisRisingWar / Math_Word_Problem_Collection
A collection for math word problem (MWP) works, including datasets, algorithms and so on.
☆44 Updated last year
UM-FAH-Yuan / FIE2025
☆15 Updated last month
pillowsofwind / Knowledge-Conflicts-Survey
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
☆127 Updated 10 months ago
cooperleong00 / Awesome-LLM-Interpretability
A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc.
☆258 Updated 4 months ago
oneal2000 / MIND
Source code of our paper MIND, ACL 2024 Long Paper
☆44 Updated last year
lancopku / label-words-are-anchors
Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
☆165 Updated last year
princeton-nlp / MQuAKE
[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
☆114 Updated 10 months ago
Hongcheng-Gao / Awesome-Long2short-on-LRMs
Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…
☆235 Updated last month
Arvid-pku / ATOKE
[AAAI 2024] History Matters: Temporal Knowledge Editing in Large Language Model
☆13 Updated last year
junzhuang-code / LLMSurveySummary
A collection of survey papers and resources related to Large Language Models (LLMs).
☆40 Updated last year
fanqiwan / Explore-Instruct
EMNLP'2023: Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration
☆36 Updated last year
zepingyu0512 / awesome-LLM-neuron
☆25 Updated last month
zepingyu0512 / awesome-llm-understanding-mechanism
awesome papers in LLM interpretability
☆522 Updated last month
wangcunxiang / LLM-Factuality-Survey
The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity"
☆340 Updated last year
zzhang0179 / Unveiling-Linguistic-Regions-in-LLMs
[ACL 2024] Unveiling Linguistic Regions in Large Language Models
☆31 Updated last year
GAIR-NLP / alignment-for-honesty
☆74 Updated last year
THUNLP-MT / PromptGating4MCTG
This is the repo for our work “An Extensible Plug-and-Play Method for Multi-Aspect Controllable Text Generation” (ACL 2023).
☆13 Updated last year
MikaStars39 / FeatureAlignment
FeatureAlignment = Alignment + Mechanistic Interpretability
☆28 Updated 4 months ago
zepingyu0512 / neuron-attribution
code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models
☆38 Updated 8 months ago
RUCAIBox / HaluEval-2.0
☆45 Updated last year