ntunlp / Evaluation-of-ChatGPT
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets.
☆15 Updated 2 years ago
Alternatives and similar repositories for Evaluation-of-ChatGPT
Users who are interested in Evaluation-of-ChatGPT are comparing it to the repositories listed below.
- 🩺 A collection of ChatGPT evaluation reports on various benchmarks. ☆50 Updated 2 years ago
- ☆88 Updated 2 years ago
- Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions" ☆75 Updated 3 years ago
- Token-level Reference-free Hallucination Detection ☆98 Updated 2 years ago
- ☆187 Updated 7 months ago
- First explanation metric (diagnostic report) for text generation evaluation ☆62 Updated 11 months ago
- Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation ☆213 Updated last year
- paper list on reasoning in NLP ☆195 Updated 9 months ago
- [NeurIPS'22 Spotlight] Data and code for our paper CoNT: Contrastive Neural Text Generation ☆152 Updated 2 years ago
- ☆177 Updated last year
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆63 Updated 2 years ago
- ☆83 Updated 2 years ago
- Detect hallucinated tokens for conditional sequence generation. ☆64 Updated 3 years ago
- The data and the PyTorch implementation for the models and experiments in the paper "Exploiting Asymmetry for Synthetic Training Data Gen… ☆64 Updated 2 years ago
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study ☆43 Updated 2 years ago
- This project maintains a reading list for general text generation tasks ☆66 Updated 4 years ago
- https://acl2023-retrieval-lm.github.io/ ☆156 Updated 2 years ago
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022 Best Paper Honorable Mention) ☆62 Updated 3 years ago
- The official code of TACL 2021, "Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies". ☆83 Updated 3 years ago
- Code, datasets, and checkpoints for the paper "Improving Passage Retrieval with Zero-Shot Question Generation (EMNLP 2022)" ☆100 Updated 3 years ago
- Code for the paper "Open Domain Question Answering with A Unified Knowledge Interface" (ACL 2022) ☆55 Updated 2 years ago
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆42 Updated 2 years ago
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆118 Updated last year
- ☆84 Updated this week
- ☆43 Updated last year
- Dataset for TACL 2022 paper: "FeTaQA: Free-form Table Question Answering" ☆87 Updated 2 years ago
- EMNLP 2022: Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443 ☆86 Updated last year
- [Neurips2023] Source code for Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory ☆62 Updated 2 years ago
- ☆47 Updated 4 months ago
- ☆52 Updated 2 years ago