AI4LIFE-GROUP / LLM_Explainer

Code for paper: Are Large Language Models Post Hoc Explainers?

☆34

Alternatives and similar repositories for LLM_Explainer

Users who are interested in LLM_Explainer compare it to the repositories listed below.

Varal7 / conformal-language-modeling
Conformal Language Modeling
☆32 Updated last year
ZaydH / influence_analysis_papers
Influence Analysis and Estimation - Survey, Papers, and Taxonomy
☆83 Updated last year
opendataval / opendataval
OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023)
☆99 Updated 8 months ago
princeton-nlp / corpus-poisoning
[EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages https://arxiv.org/abs/2310.19156
☆40 Updated last year
UCSB-NLP-Chang / llm_uncertainty
☆40 Updated last year
zlin7 / UQ-NLG
☆102 Updated last year
tatsu-lab / conformal-factual-lm
☆33 Updated last year
MiaoXiong2320 / llm-uncertainty
code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs"
☆135 Updated last year
ykwon0407 / DataInf
DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024)
☆76 Updated last year
UW-Madison-Lee-Lab / LanguageInterfacedFineTuning
Code for Language-Interfaced FineTuning for Non-Language Machine Learning Tasks.
☆132 Updated 11 months ago
ajyl / dpo_toxic
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.
☆83 Updated 7 months ago
JacksonWuxs / UsableXAI_LLM
Using Explanations as a Tool for Advanced LLMs
☆67 Updated last year
rushrukh / awesome-explainable-ai
A repository for summaries of recent explainable AI/Interpretable ML approaches
☆84 Updated last year
jjcherian / conformal-safety
☆32 Updated 11 months ago
MadryLab / trak
A fast, effective data attribution method for neural networks in PyTorch
☆220 Updated 11 months ago
balevinstein / Probes
☆57 Updated 2 years ago
chrisliu298 / awesome-representation-engineering
A resource repository for representation engineering in large language models
☆139 Updated 11 months ago
i-gallegos / Fair-LLM-Benchmark
☆155 Updated 2 years ago
lucweytingh / ARL-UvA
A reproduced PyTorch implementation of the Adversarially Reweighted Learning (ARL) model, originally presented in "Fairness without Demog…
☆20 Updated 4 years ago
ahxt / fair_fairness_benchmark
FFB: A Fair Fairness Benchmark for In-Processing Group Fairness Methods.
☆30 Updated last year
Vaidehi99 / InfoDeletionAttacks
☆46 Updated 8 months ago
causalNLP / corr2cause
Data and code for the Corr2Cause paper (ICLR 2024)
☆111 Updated last year
deeplearning-wisc / haloscope
source code for NeurIPS'24 paper "HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection"
☆60 Updated 6 months ago
daviddao / awesome-data-valuation
💱 A curated list of data valuation (DV) to design your next data marketplace
☆129 Updated 8 months ago
pratyushmaini / localizing-memorization
Official Repository for ICML 2023 paper "Can Neural Network Memorization Be Localized?"
☆20 Updated 2 years ago
weitianxin / awesome-distribution-shift
A curated list of papers and resources about the distribution shift in machine learning.
☆123 Updated 2 years ago
dylan-slack / Tablet
The TABLET benchmark for evaluating instruction learning with LLMs for tabular prediction.
☆22 Updated 2 years ago
uvanlp / valda
A Python Data Valuation Package
☆30 Updated 2 years ago
jamqd / Group-Preference-Optimization
☆21 Updated last year
alstonlo / torch-influence
A simple PyTorch implementation of influence functions.
☆91 Updated last year