zlin7 / UQ-NLG

☆102

Alternatives and similar repositories for UQ-NLG

Users who are interested in UQ-NLG are comparing it to the repositories listed below.

lorenzkuhn / semantic_uncertainty
☆180 Updated last year
UCSB-NLP-Chang / llm_uncertainty
☆40 Updated last year
Thartvigsen / GRACE
[NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
☆82 Updated 11 months ago
balevinstein / Probes
☆57 Updated 2 years ago
MiaoXiong2320 / llm-uncertainty
code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs"
☆137 Updated last year
ajyl / dpo_toxic
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.
☆84 Updated 8 months ago
ykwon0407 / DataInf
DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024)
☆75 Updated last year
roeehendel / icl_task_vectors
☆101 Updated 2 years ago
OATML / semantic-entropy-probes
☆46 Updated last year
dannyallover / overthinking_the_truth
☆29 Updated last year
deeplearning-wisc / args
☆46 Updated last year
activatedgeek / calibration-tuning
☆52 Updated 7 months ago
logix-project / logix
AI Logging for Interpretability and Explainability🔬
☆133 Updated last year
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆62 Updated last year
milesaturpin / cot-unfaithfulness
☆51 Updated 2 years ago
abhishekpanigrahi1996 / Skill-Localization-by-grafting
☆51 Updated last year
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆67 Updated 11 months ago
MaheepChaudhary / SAE-Ravel
Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…
☆12 Updated 9 months ago
ericwtodd / function_vectors
Function Vectors in Large Language Models (ICLR 2024)
☆184 Updated 7 months ago
IBM / activation-steering
[ICLR 2025] General-purpose activation steering library
☆119 Updated 2 months ago
stanfordnlp / axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
☆141 Updated 4 months ago
tatsu-lab / linguistic_calibration
Align your LM to express calibrated verbal statements of confidence in its long-form generations.
☆27 Updated last year
mega002 / ff-layers
The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le…
☆99 Updated 4 years ago
deeplearning-wisc / haloscope
source code for NeurIPS'24 paper "HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection"
☆61 Updated 7 months ago
launchnlp / LitCab
☆25 Updated 5 months ago
chrisliu298 / awesome-representation-engineering
A resource repository for representation engineering in large language models
☆140 Updated last year
chujiezheng / LLM-MCQ-Bias
Official repository for ICLR 2024 Spotlight paper "Large Language Models Are Not Robust Multiple Choice Selectors"
☆42 Updated 6 months ago
javiferran / sae_entities
☆63 Updated 8 months ago
princeton-nlp / MQuAKE
[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
☆118 Updated last year
eric-mitchell / serac
Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model
☆69 Updated 3 years ago