☆33Dec 17, 2025Updated 2 months ago
Alternatives and similar repositories for VeriScore
Users who are interested in VeriScore are comparing it to the libraries listed below.
- Repository for DEMETR: Diagnosing Evaluation Metrics for Translation☆17Nov 29, 2022Updated 3 years ago
- Code for "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models"☆12Feb 10, 2025Updated last year
- ☆18Dec 1, 2024Updated last year
- ☆11May 1, 2022Updated 3 years ago
- FactScoreLite is an implementation of the FactScore metric, designed for detailed accuracy assessment in text generation. This package bu…☆13Apr 25, 2024Updated last year
- ☆12Sep 1, 2021Updated 4 years ago
- ☆16Dec 10, 2022Updated 3 years ago
- ☆15Aug 3, 2021Updated 4 years ago
- An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions"☆16Nov 5, 2024Updated last year
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu).☆20May 14, 2022Updated 3 years ago
- [EMNLP-2025] R1-Zero on ANY TASK☆27Nov 9, 2025Updated 3 months ago
- ☆21Jul 28, 2022Updated 3 years ago
- ☆55Mar 27, 2023Updated 2 years ago
- ☆54Oct 24, 2024Updated last year
- Codebase for LLM Textual Hallucination Benchmark☆74Apr 25, 2025Updated 10 months ago
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)☆63Dec 25, 2023Updated 2 years ago
- ☆29Dec 2, 2024Updated last year
- ☆71Nov 27, 2024Updated last year
- Official repository for "PostMark: A Robust Blackbox Watermark for Large Language Models"☆27Aug 30, 2024Updated last year
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic…☆417Apr 13, 2025Updated 10 months ago
- ☆39Jun 7, 2023Updated 2 years ago
- Paddle reimplementation of the paper "ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information" (ACL 2021)☆10Nov 15, 2021Updated 4 years ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…☆44Aug 10, 2024Updated last year
- The corresponding code from our paper "Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning…☆13Jan 14, 2026Updated last month
- ☆12Jul 25, 2023Updated 2 years ago
- Wuhan University compiler principles practical course assignment; see the PDF for details☆10Jun 14, 2017Updated 8 years ago
- Website for release of TellMeWhy dataset for why question answering☆14Nov 11, 2022Updated 3 years ago
- A Python implementation of PSNR that takes the human visual system into account.☆13Jul 6, 2023Updated 2 years ago
- ☆11Jun 5, 2024Updated last year
- Code for "Towards Robust k-Nearest-Neighbor Machine Translation" (EMNLP 2022)☆12Oct 18, 2022Updated 3 years ago
- ☆26Nov 7, 2022Updated 3 years ago
- Wenzhou-Kean University AI-LAB☆10Jun 6, 2022Updated 3 years ago
- ☆13Sep 26, 2024Updated last year
- This repository provides the dataset used in "Schema-Guided Natural Language Generation" by Yuheng Du, Shereen Oraby, Vittorio Perera, Mi…☆13Dec 8, 2020Updated 5 years ago
- ☆10May 27, 2024Updated last year
- RIBES is an automatic evaluation metric for machine translation.☆11Sep 7, 2017Updated 8 years ago