JacksonWuxs / UsableXAI_LLMLinks

Using Explanations as a Tool for Advanced LLMs

☆66

Alternatives and similar repositories for UsableXAI_LLM

Users that are interested in UsableXAI_LLM are comparing it to the libraries listed below

Sorting:

zjunlp / KnowledgeCircuits
[NeurIPS 2024] Knowledge Circuits in Pretrained Transformers
☆151Updated 5 months ago
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆61Updated last year
hy-zhao23 / Explainability-for-Large-Language-Models
☆148Updated last year
deeplearning-wisc / haloscope
source code for NeurIPS'24 paper "HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection"
☆52Updated 3 months ago
MiaoXiong2320 / llm-uncertainty
code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs"
☆125Updated last year
Thartvigsen / GRACE
[NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
☆78Updated 7 months ago
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆62Updated 8 months ago
weizhepei / InstructRAG
[ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
☆114Updated 6 months ago
tatsu-lab / test_set_contamination
☆39Updated last year
yubol-bobo / Awesome-Multi-Turn-LLMs
This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language …
☆91Updated 2 months ago
Zayne-sprague / To-CoT-or-not-to-CoT
☆26Updated 4 months ago
fc2869 / lo-fit
LoFiT: Localized Fine-tuning on LLM Representations
☆39Updated 6 months ago
declare-lab / trust-align
Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref…
☆63Updated 5 months ago
zhiyuanhubj / UoT
[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
☆101Updated last year
shengliu66 / ICV
Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
☆182Updated 5 months ago
jinzhuoran / RWKU
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024
☆77Updated 10 months ago
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆26Updated 10 months ago
alisawuffles / proxy-tuning
Code associated with Tuning Language Models by Proxy (Liu et al., 2024)
☆114Updated last year
zepingyu0512 / neuron-attribution
code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models
☆39Updated 8 months ago
jlko / long_hallucinations
Codebase for reproducing the experiments of the semantic uncertainty paper (paragraph-length experiments).
☆67Updated last year
Lordog / R-Judge
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024)
☆84Updated 2 months ago
ZFancy / awesome-activation-engineering
A curated list of resources for activation engineering
☆99Updated 2 months ago
alon-albalak / data-selection-survey
A Survey on Data Selection for Language Models
☆246Updated 3 months ago
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆81Updated last month
jinhaoduan / SAR
[ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
☆53Updated 11 months ago
JasonForJoy / Model-Editing-Hurt
EMNLP 2024: Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue
☆35Updated 2 months ago
sail-sg / CPO
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
☆127Updated 4 months ago
yaojin17 / Unlearning_LLM
[ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models"
☆59Updated 10 months ago
MingLiiii / Layer_Gradient
[ACL'25 Oral] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
☆71Updated last month
prateeky2806 / ties-merging
☆185Updated last year