LoveCatc / supervised-llm-uncertainty-estimation
This repo contains code for the paper "Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach".
☆14 Updated 4 months ago
Alternatives and similar repositories for supervised-llm-uncertainty-estimation:
Users who are interested in supervised-llm-uncertainty-estimation are comparing it to the repositories listed below:
- ☆46 Updated 7 months ago
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆66 Updated 8 months ago
- Data and code for the Corr2Cause paper (ICLR 2024) ☆94 Updated 10 months ago
- LoFiT: Localized Fine-tuning on LLM Representations ☆34 Updated last month
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆52 Updated 11 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆50 Updated 3 months ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆67 Updated 11 months ago
- Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models ☆91 Updated 7 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆101 Updated 11 months ago
- ☆47 Updated last year
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆106 Updated 11 months ago
- Algebraic value editing in pretrained language models ☆63 Updated last year
- ☆89 Updated last year
- Function Vectors in Large Language Models (ICLR 2024) ☆142 Updated 5 months ago
- Implementation of PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆33 Updated 4 months ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆73 Updated 2 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆100 Updated 5 months ago
- ☆84 Updated 8 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆76 Updated last week
- This is the official repo for Towards Uncertainty-Aware Language Agent. ☆24 Updated 6 months ago
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆56 Updated 3 months ago
- Exploring the Limitations of Large Language Models on Multi-Hop Queries ☆24 Updated last week
- ☆59 Updated 6 months ago
- PASTA: Post-hoc Attention Steering for LLMs ☆113 Updated 3 months ago
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆166 Updated last month
- AbstainQA, ACL 2024 ☆25 Updated 5 months ago
- Critique-out-Loud Reward Models ☆53 Updated 4 months ago
- Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers". ☆70 Updated last year
- ☆26 Updated 8 months ago