Spico197 / awesome-lm-evaluation
🩺 A collection of ChatGPT evaluation reports on various benchmarks.
☆48 · Updated last year
Alternatives and similar repositories for awesome-lm-evaluation:
Users interested in awesome-lm-evaluation are comparing it to the repositories listed below.
- PyTorch code for the EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialog… ☆27 · Updated 3 years ago
- The code implementation of the EMNLP 2022 paper: DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Gene… ☆25 · Updated last year
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study ☆43 · Updated 2 years ago
- ☆70 · Updated 2 years ago
- ✍️ ChatGPT as a writing partner. ☆14 · Updated 2 years ago
- First explanation metric (diagnostic report) for text generation evaluation ☆62 · Updated 2 weeks ago
- Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning ☆36 · Updated last year
- Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning" ☆57 · Updated 2 years ago
- Code for M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models ☆22 · Updated 7 months ago
- ☆31 · Updated last year
- Code & data for our paper "RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation" (EMNLP 2023) ☆17 · Updated last year
- Calculate the probability of a paper being accepted at EMNLP 2023 based on the score distribution of ACL 2023. ☆14 · Updated last year
- Code for Teaching LM to Translate with Comparison ☆39 · Updated last year