AI4LIFE-GROUP / LLM_Explainer
Code for paper: Are Large Language Models Post Hoc Explainers?
☆30, updated 6 months ago
Alternatives and similar repositories for LLM_Explainer:
Users interested in LLM_Explainer are comparing it to the repositories listed below.
- Conformal Language Modeling (☆28, updated last year)
- A repository for summaries of recent explainable AI / interpretable ML approaches (☆73, updated 4 months ago)
- Code for Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks (☆122, updated 3 months ago)
- Data and code for the Corr2Cause paper (ICLR 2024) (☆93, updated 10 months ago)
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) (☆62, updated 4 months ago)
- [EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages (https://arxiv.org/abs/2310.19156) (☆29, updated last year)
- Influence Analysis and Estimation: Survey, Papers, and Taxonomy (☆69, updated 11 months ago)
- Code repository for the ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs" (☆100, updated 11 months ago)
- Using Explanations as a Tool for Advanced LLMs (☆58, updated 5 months ago)
- The TABLET benchmark for evaluating instruction learning with LLMs for tabular prediction (☆20, updated last year)
- A simple PyTorch implementation of influence functions (☆84, updated 8 months ago)
- Lightweight Adapting for Black-Box Large Language Models (☆19, updated last year)
- Official repository for the ICML 2023 paper "Can Neural Network Memorization Be Localized?" (☆17, updated last year)
- A resource repository for representation engineering in large language models (☆102, updated 3 months ago)
- OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023) (☆93, updated 2 weeks ago)
- Code for the paper "Aligning Large Language Models with Representation Editing: A Control Perspective" (☆24, updated 3 weeks ago)
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" (☆84, updated last week)
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity (☆62, updated 3 months ago)
- 🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con…" (☆39, updated last year)
- A Python Data Valuation Package (☆28, updated 2 years ago)
- Code for the NeurIPS 2023 paper "A Bayesian Approach To Analysing Training Data Attribution In Deep Learning" (☆15, updated last year)