IINemo/lm-polygraph

☆497

Alternatives and similar repositories for lm-polygraph

Users who are interested in lm-polygraph are comparing it to the libraries listed below.

IINemo / llm-uncertainty-head
☆26 · Feb 23, 2026 · Updated 5 months ago
jinhaoduan / SAR
[ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
☆63 · Sep 4, 2024 · Updated last year
zlin7 / UQ-NLG
☆106 · Jun 30, 2024 · Updated 2 years ago
AlexanderVNikitin / kernel-language-entropy
Code for Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities (NeurIPS'24)
☆36 · Dec 17, 2024 · Updated last year
IINemo / thinkbooster
Open-source framework for test-time compute scaling of LLMs. Includes a visual debugger for inspecting reasoning traces and an endpoint t…
☆27 · Jun 30, 2026 · Updated 3 weeks ago
stat-ml / alpaca
Library for active learning and uncertainty estimation in machine learning
☆28 · Oct 26, 2022 · Updated 3 years ago
UCSB-NLP-Chang / llm_uncertainty
☆43 · Feb 2, 2024 · Updated 2 years ago
jlko / semantic_uncertainty
Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).
☆421 · Apr 12, 2024 · Updated 2 years ago
jxzhangjhu / Awesome-LLM-Uncertainty-Reliability-Robustness
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
☆833 · Jun 5, 2026 · Updated last month
WSNLP / al_toolbox
Active learning
☆78 · Feb 8, 2023 · Updated 3 years ago
smartyfh / LLM-Uncertainty-Bench
Benchmarking LLMs via Uncertainty Quantification
☆263 · Jan 30, 2024 · Updated 2 years ago
MaHuanAAA / logtoku
☆42 · Aug 21, 2025 · Updated 11 months ago
s-nlp / Evergreen
☆23 · Jun 10, 2025 · Updated last year
tatsu-lab / linguistic_calibration
Align your LM to express calibrated verbal statements of confidence in its long-form generations.
☆30 · Jun 4, 2024 · Updated 2 years ago
Ybakman / TruthTorchLM
☆64 · Feb 20, 2026 · Updated 5 months ago
EdinburghNLP / awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
☆1,119 · Updated this week
s-nlp / AdaRAGUE
[ACL 2025] Adaptive Retrieval without Self-Knowledge? Bringing Uncertainty Back Home
☆20 · May 17, 2025 · Updated last year
s-nlp / PsiloQA
The PsiloQA pipeline automates the construction of a multilingual, span-level hallucination detection dataset with contexts.
☆16 · Apr 24, 2026 · Updated 3 months ago
MiaoXiong2320 / llm-uncertainty
code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs"
☆148 · Mar 14, 2024 · Updated 2 years ago
stat-ml / GeoMLE
This repo contains code for GeoMLE intrinsic dimension estimation algorithm
☆21 · Jul 10, 2020 · Updated 6 years ago
dialogue-evaluation / taxonomy-enrichment
Dialogue Evaluation 2020: Taxonomy Enrichment for the Russian Language
☆12 · Nov 7, 2020 · Updated 5 years ago
zthang / Focus
☆24 · Feb 3, 2024 · Updated 2 years ago
shmsw25 / FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic…
☆450 · Apr 13, 2025 · Updated last year
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆83 · Jun 20, 2026 · Updated last month
powpowengineering / MeteoStation_v2
new version of weather station. one based on stm32
☆14 · Nov 5, 2023 · Updated 2 years ago
moussaKam / FrugalScore
FrugalScore is an approach to learn a fixed, low cost version of any expensive NLG metric, while retaining most of its original performan…
☆16 · Sep 21, 2022 · Updated 3 years ago
balevinstein / Probes
☆58 · Jun 30, 2023 · Updated 3 years ago
Kaleidophon / nlp-uncertainty-zoo
Model zoo for different kinds of uncertainty quantification methods used in Natural Language Processing, implemented in PyTorch.
☆55 · May 5, 2023 · Updated 3 years ago
ZBox1005 / CoT-UQ
[ACL 2025] "CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought"
☆17 · Apr 3, 2025 · Updated last year
Jiuzhouh / Uncertainty-Aware-Language-Agent
This is the official repo for Towards Uncertainty-Aware Language Agent.
☆31 · Aug 15, 2024 · Updated last year
MBZUAI-Paris / MixtureKit
MixtureKit: A General Framework for Composing, Training, and Visualizing Mixture-of-Experts Models
☆33 · May 25, 2026 · Updated 2 months ago
apple / ml-selfreflect
☆45 · Sep 30, 2025 · Updated 9 months ago
lingchen0331 / UQ_ICL
Uncertainty quantification for in-context learning of large language models
☆15 · Apr 1, 2024 · Updated 2 years ago
OATML / semantic-entropy-probes
☆65 · Jul 12, 2026 · Updated 2 weeks ago
swiss-ai / parity-aware-bpe
Parity-Aware Byte-Pair Encoding: Improving Cross-lingual Fairness in Tokenization [ACL 2026]
☆20 · Apr 18, 2026 · Updated 3 months ago
RUCAIBox / HaluEval
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.
☆592 · Feb 12, 2024 · Updated 2 years ago
MiaoXiong2320 / ProximityBias-Calibration
☆19 · Nov 11, 2023 · Updated 2 years ago
D2I-ai / eigenscore
☆46 · Dec 9, 2024 · Updated last year
acl-org / arr-health
Monitoring the health of ARR
☆32 · Apr 4, 2026 · Updated 3 months ago