ssu-humane / HerO
The code for HerO, a fact-checking pipeline based on open LLMs (runner-up in the AVeriTeC shared task)
☆10 · Updated 4 months ago
Alternatives and similar repositories for HerO
Users interested in HerO are comparing it to the repositories listed below.
- Source Code of Paper "GPTScore: Evaluate as You Desire" ☆254 · Updated 2 years ago
- ☆79 · Updated 7 months ago
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation" ☆366 · Updated 3 months ago
- ☆61 · Updated 8 months ago
- Multilingual Large Language Models Evaluation Benchmark ☆128 · Updated 11 months ago
- Tools for checking ACL paper submissions ☆770 · Updated 2 months ago
- ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models. ☆143 · Updated 7 months ago
- This is a repository for sharing papers in the field of persona-based conversational AI. The related source code for each paper is linked. ☆165 · Updated last year
- Official code for the ACL 2024 paper: Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages ☆52 · Updated last year
- Data and info for the paper "ParaDetox: Text Detoxification with Parallel Data" ☆31 · Updated 4 months ago
- Awesome LLM for NLG Evaluation Papers ☆24 · Updated last year
- BARTScore: Evaluating Generated Text as Text Generation ☆357 · Updated 3 years ago
- Code for the paper "G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment" ☆366 · Updated last year
- Codebase, data and models for the SummaC paper in TACL ☆98 · Updated 6 months ago
- ☆141 · Updated last year
- Code and data for Marked Personas (ACL 2023) ☆27 · Updated 2 years ago
- The implementation of "RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question" [ACL 2023] ☆16 · Updated last year
- This repository contains the data and code introduced in the paper "CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models" ☆123 · Updated last year
- Repository for the EMNLP 2022 paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation ☆206 · Updated last year
- Unofficial re-implementation of "Trusting Your Evidence: Hallucinate Less with Context-aware Decoding" ☆30 · Updated 8 months ago
- ☆243 · Updated last year
- ☆184 · Updated last month
- This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models. ☆498 · Updated last year
- GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆192 · Updated 8 months ago
- SemEval2024-task8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection ☆76 · Updated last year
- Code and data for Koo et al.'s ACL 2024 paper "Benchmarking Cognitive Biases in Large Language Models as Evaluators" ☆21 · Updated last year
- ☆45 · Updated last year
- Dataset associated with the "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper ☆79 · Updated 4 years ago
- ☆17 · Updated last year
- Code for the paper "You Truly Understand What I Need: Intellectual and Friendly Dialogue Agents grounding Knowledge and Persona" which i… ☆23 · Updated 2 years ago