ntunlp / Evaluation-of-ChatGPT
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets.
☆15 Updated 2 years ago
Alternatives and similar repositories for Evaluation-of-ChatGPT
Users who are interested in Evaluation-of-ChatGPT are comparing it to the repositories listed below.
- First explanation metric (diagnostic report) for text generation evaluation ☆62 Updated 4 months ago
- ☆43 Updated last year
- On Transferability of Prompt Tuning for Natural Language Processing ☆99 Updated last year
- Codes and Datasets for our ACL 2023 paper on cognitive reframing of negative thoughts ☆63 Updated last year
- Code and data associated with the AmbiEnt dataset in "We're Afraid Language Models Aren't Modeling Ambiguity" (Liu et al., 2023) ☆64 Updated last year
- ☆44 Updated 10 months ago
- Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Mod… ☆37 Updated last year
- Code and data for the FACTOR paper ☆48 Updated last year
- Evaluate the Quality of Critique ☆36 Updated last year
- Repo for "On Learning to Summarize with Large Language Models as References" ☆44 Updated 2 years ago
- Code and Resources for the paper, "Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries" ☆16 Updated last year
- Detect hallucinated tokens for conditional sequence generation. ☆64 Updated 3 years ago
- ☆53 Updated 3 years ago
- Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation". ☆81 Updated last month
- ☆21 Updated 2 years ago
- Token-level Reference-free Hallucination Detection ☆94 Updated last year
- ⚡Research papers about leveraging the capabilities of language models⚡ ☆52 Updated 2 years ago
- ☆39 Updated 2 years ago
- A curated list of research papers and resources on Cultural LLM. ☆45 Updated 9 months ago
- Apps built using Inspired Cognition's Critique. ☆58 Updated 2 years ago
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022 Best Paper Honorable Mention) ☆62 Updated 3 years ago
- ☆87 Updated 2 years ago
- Resources for cultural NLP research ☆98 Updated 2 months ago
- Interpretable unified language safety checking with large language models ☆31 Updated 2 years ago
- ☆48 Updated last year
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study ☆43 Updated 2 years ago
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆40 Updated 2 years ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆86 Updated 11 months ago
- This repository contains the code for extracting the test samples we used in our paper: "A Multitask, Multilingual, Multimodal Evaluatio… ☆77 Updated last year
- [ACL 2023] Code and Data Repo for Paper "Element-aware Summary and Summary Chain-of-Thought (SumCoT)" ☆54 Updated last year