jlko / semantic_uncertainty

Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).

☆329

Alternatives and similar repositories for semantic_uncertainty

Users who are interested in semantic_uncertainty compare it to the repositories listed below.

lorenzkuhn / semantic_uncertainty
☆170 Updated last year
RUCAIBox / HaluEval
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.
☆479 Updated last year
EdinburghNLP / awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
☆896 Updated last week
LuckyyySTA / Awesome-LLM-hallucination
LLM hallucination paper list
☆318 Updated last year
jlko / long_hallucinations
Codebase for reproducing the experiments of the semantic uncertainty paper (paragraph-length experiments).
☆63 Updated last year
voidism / DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
☆498 Updated 5 months ago
shmsw25 / FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic…
☆353 Updated 2 months ago
HITsz-TMG / awesome-llm-attributions
A Survey of Attributions for Large Language Models
☆203 Updated 10 months ago
llm-as-a-judge / Awesome-LLM-as-a-judge
☆363 Updated 3 weeks ago
jxzhangjhu / Awesome-LLM-Uncertainty-Reliability-Robustness
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
☆759 Updated last month
alon-albalak / data-selection-survey
A Survey on Data Selection for Language Models
☆237 Updated last month
ruizheliUOA / Awesome-Interpretability-in-Large-Language-Models
This repository collects all relevant resources about interpretability in LLMs
☆358 Updated 7 months ago
Zhen-Tan-dmml / LLM4Annotation
☆573 Updated 3 weeks ago
potsawee / selfcheckgpt
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
☆537 Updated last year
wangcunxiang / LLM-Factuality-Survey
The repository for the survey paper <<Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity>>
☆340 Updated last year
shengliu66 / ICV
Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
☆177 Updated 4 months ago
cooperleong00 / Awesome-LLM-Interpretability
A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc.
☆253 Updated 3 months ago
OATML / semantic-entropy-probes
☆34 Updated 10 months ago
zjunlp / KnowledgeCircuits
[NeurIPS 2024] Knowledge Circuits in Pretrained Transformers
☆148 Updated 4 months ago
nelson-liu / lost-in-the-middle
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
☆348 Updated last year
andyzoujm / representation-engineering
Representation Engineering: A Top-Down Approach to AI Transparency
☆835 Updated 10 months ago
chrisliu298 / awesome-representation-engineering
A resource repository for representation engineering in large language models
☆126 Updated 7 months ago
HowieHwong / TrustLLM
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
☆574 Updated 3 months ago
HillZhang1999 / llm-hallucination-survey
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large …
☆1,021 Updated 7 months ago
StonyBrookNLP / ircot
Repository for Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions, ACL23
☆215 Updated last year
teacherpeterpan / self-correction-llm-papers
This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
☆531 Updated 7 months ago
ezelikman / STaR
Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)
☆206 Updated 2 years ago
likenneth / honest_llama
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
☆530 Updated 4 months ago
allenai / reward-bench
RewardBench: the first evaluation tool for reward models.
☆604 Updated 2 weeks ago
OpenSafetyLab / SALAD-BENCH
【ACL 2024】 SALAD benchmark & MD-Judge
☆150 Updated 3 months ago