zlin7 / UQ-NLG
☆102 Updated last year
Alternatives and similar repositories for UQ-NLG
Users that are interested in UQ-NLG are comparing it to the libraries listed below
- ☆179 Updated last year
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆81 Updated 10 months ago
- ☆40 Updated last year
- ☆57 Updated 2 years ago
- ☆46 Updated last year
- code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs" ☆133 Updated last year
- ☆98 Updated 2 years ago
- ☆46 Updated last year
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆83 Updated 7 months ago
- ☆51 Updated last year
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆66 Updated 11 months ago
- ☆52 Updated 6 months ago
- ☆29 Updated last year
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆76 Updated last year
- This is the official repo for Towards Uncertainty-Aware Language Agent. ☆29 Updated last year
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆136 Updated 4 months ago
- [ICLR 2025] General-purpose activation steering library ☆114 Updated last month
- ☆25 Updated 4 months ago
- PASTA: Post-hoc Attention Steering for LLMs ☆125 Updated 11 months ago
- [ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models ☆59 Updated last year
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆68 Updated 2 years ago
- ☆49 Updated 2 years ago
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ☆31 Updated 9 months ago
- Official repository for ICLR 2024 Spotlight paper "Large Language Models Are Not Robust Multiple Choice Selectors" ☆41 Updated 5 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆61 Updated last year
- Restore safety in fine-tuned language models through task arithmetic ☆29 Updated last year
- AI Logging for Interpretability and Explainability🔬 ☆130 Updated last year
- [NeurIPS 2023 D&B Track] Code and data for paper "Revisiting Out-of-distribution Robustness in NLP: Benchmarks, Analysis, and LLMs Evalua… ☆35 Updated 2 years ago
- Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024) ☆13 Updated last year
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆63 Updated 2 years ago