Ybakman / LLM_UncertaintyLinks
☆ 11 · Updated 11 months ago
Alternatives and similar repositories for LLM_Uncertainty
Users interested in LLM_Uncertainty are comparing it to the libraries listed below.
- ☆ 172 · Updated last year
- `dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. ☆ 84 · Updated 2 months ago
- ☆ 99 · Updated last year
- ☆ 163 · Updated 9 months ago
- AI Logging for Interpretability and Explainability 🔬 ☆ 124 · Updated last year
- Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature ☆ 160 · Updated 2 months ago
- A fast, effective data attribution method for neural networks in PyTorch ☆ 217 · Updated 9 months ago
- ☆ 55 · Updated 2 years ago
- ☆ 226 · Updated last year
- LoFiT: Localized Fine-tuning on LLM Representations ☆ 40 · Updated 7 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆ 125 · Updated 2 months ago
- Code for papers on Large Language Models Personalization (LaMP) ☆ 167 · Updated 6 months ago
- Conformal Language Modeling ☆ 32 · Updated last year
- ☆ 51 · Updated last year
- "Understanding Dataset Difficulty with V-Usable Information" (ICML 2022, outstanding paper) ☆ 87 · Updated last year
- ☆ 96 · Updated last year
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. ☆ 53 · Updated 11 months ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆ 79 · Updated 8 months ago
- SafeArena is a benchmark for assessing the harmful capabilities of web agents ☆ 17 · Updated 4 months ago
- [EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages (https://arxiv.org/abs/2310.19156) ☆ 37 · Updated last year
- Code repo for the ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs" ☆ 131 · Updated last year
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆ 63 · Updated 9 months ago
- ☆ 293 · Updated last year
- How do transformer LMs encode relations? ☆ 52 · Updated last year
- ☆ 47 · Updated last month
- This repository collects all relevant resources about interpretability in LLMs ☆ 370 · Updated 10 months ago
- ☆ 165 · Updated 9 months ago
- [ICLR 2025] General-purpose activation steering library ☆ 99 · Updated last week
- Steering Llama 2 with Contrastive Activation Addition ☆ 178 · Updated last year
- A Survey of Attributions for Large Language Models ☆ 211 · Updated last year