krystalan / chatgpt_as_nlg_evaluator
Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study
☆43Updated last year
Alternatives and similar repositories for chatgpt_as_nlg_evaluator:
Users that are interested in chatgpt_as_nlg_evaluator are comparing it to the libraries listed below
- We construct and introduce DIALFACT, a testing benchmark dataset crowd-annotated conversational claims, paired with pieces of evidence fr…☆41Updated 2 years ago
- First explanation metric (diagnostic report) for text generation evaluation☆63Updated 7 months ago
- [EMNLP 2022] Code and data for "Controllable Dialogue Simulation with In-Context Learning"☆34Updated last year
- 🩺 A collection of ChatGPT evaluation reports on various bechmarks.☆48Updated last year
- ☆26Updated 2 years ago
- Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning☆36Updated last year
- Official code for "Continual Prompt Tuning for Dialog State Tracking" (ACL 2022).☆27Updated last year
- An Interpretable Neuro-Symbolic Framework for Task-Oriented Dialogue Generation☆23Updated 2 years ago
- ACL'23: Unified Demonstration Retriever for In-Context Learning☆36Updated last year
- ☆16Updated 11 months ago
- Code for EMNLP 2021 paper "CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization"☆45Updated 3 years ago
- Dataset, metrics, and models for TACL 2023 paper MACSUM: Controllable Summarization with Mixed Attributes.☆34Updated last year
- The source code of our ACL paper "A Training-free and Reference-free Summarization Evaluation Metric via Centrality-weighted Relevance an…☆14Updated last year
- The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables"☆19Updated last year
- ☆30Updated last year
- Code and data for paper "Context-faithful Prompting for Large Language Models".☆39Updated last year
- ☆82Updated last year
- ☆85Updated last year
- The code of Paper "Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering".☆21Updated 2 years ago
- The code implementation of the EMNLP2022 paper: DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Gene…☆25Updated last year
- ☆47Updated 2 years ago
- TBC☆26Updated 2 years ago
- Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"☆41Updated 3 years ago
- EMNLP'2022: BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation☆40Updated 2 years ago
- This code accompanies the paper DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering.☆17Updated last year
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain between cause-effect relationships. It is a QA dataset containing 9000…☆47Updated last year
- Code for embedding and retrieval research.☆16Updated last year
- Paradigm shift in natural language processing☆42Updated 2 years ago
- ☆28Updated 11 months ago
- Code for the paper Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning☆99Updated last year