chongyangtao / LLMs-for-NLG-Evaluation
Awesome LLM for NLG Evaluation Papers
☆23 Updated 11 months ago
Alternatives and similar repositories for LLMs-for-NLG-Evaluation:
Users who are interested in LLMs-for-NLG-Evaluation are comparing it to the libraries listed below:
- First explanation metric (diagnostic report) for text generation evaluation ☆62 Updated 6 months ago
- Code, datasets, and checkpoints for the paper "Improving Passage Retrieval with Zero-Shot Question Generation (EMNLP 2022)" ☆99 Updated 2 years ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆64 Updated 9 months ago
- GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆57 Updated last year
- 🌲 Code for our EMNLP 2023 paper - 🎄 "Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Mode… ☆48 Updated last year
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision ☆83 Updated 2 months ago
- This code accompanies the paper DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering. ☆17 Updated last year
- Open-WikiTable: Dataset for Open Domain Question Answering with Complex Reasoning over Table ☆22 Updated last year
- The code implementation of the EMNLP2022 paper: DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Gene… ☆25 Updated last year
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study ☆43 Updated last year
- ConvGQR: Generative Query Reformulation for Conversational Search. A codebase for ACL 2023 accepted paper. ☆27 Updated 10 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆55 Updated 6 months ago
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages ☆41 Updated last month
- Code and Data Repo for [ACL 2023] Paper "Element-aware Summary and Summary Chain-of-Thought (SumCoT)" ☆54 Updated 11 months ago
- Code and data for "MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models" ☆32 Updated 3 months ago
- GPT as Human ☆18 Updated last month
- Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning ☆99 Updated last year
- Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024) ☆45 Updated 3 months ago
- Code and data for the FACTOR paper ☆44 Updated last year
- Contrastive Chain-of-Thought Prompting ☆57 Updated last year
- Official codebase for permutation self-consistency. ☆16 Updated 11 months ago
- Official repository for ListT5 ☆41 Updated 2 weeks ago
- Code and models for the paper "Questions Are All You Need to Train a Dense Passage Retriever (TACL 2023)" ☆62 Updated 2 years ago