Spico197 / awesome-lm-evaluation

🩺 A collection of ChatGPT evaluation reports on various benchmarks.

☆49

Alternatives and similar repositories for awesome-lm-evaluation

Users who are interested in awesome-lm-evaluation are comparing it to the repositories listed below.

krystalan / chatgpt_as_nlg_evaluator
Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study
☆43 Updated 2 years ago
Spico197 / writing-comrade
✒️ ChatGPT as a writing partner.
☆14 Updated 2 years ago
siyuyuan / coscript
Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning
☆36 Updated last year
nuaa-nlp / Evaluation-of-ChatGPT
☆14 Updated 2 years ago
xiami2019 / CLAIF
[Findings of ACL'2023] Improving Contrastive Learning of Sentence Embeddings from AI Feedback
☆39 Updated last year
Zeng-WH / Prompt-Tuning
Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"
☆57 Updated 3 years ago
ChiyuSONG / dynamics-of-instruction-tuning
☆17 Updated 5 months ago
izhx / uni-rep
Code for embedding and retrieval research.
☆17 Updated last year
littlehacker26 / Discriminator-Cooperative-Unlikelihood-Prompt-Tuning
The code implementation of the EMNLP2022 paper: DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Gene…
☆26 Updated last year
ExpressAI / reStructured-Pretraining
reStructured Pre-training
☆98 Updated 2 years ago
liyongqi67 / MMCoQA
☆32 Updated last year
OhadRubin / EPR
☆63 Updated 2 years ago
txsun1997 / nlp-paradigm-shift
Paradigm shift in natural language processing
☆42 Updated 3 years ago
Wangpeiyi9979 / ESD
Code for NAACL2022 Long Paper "An Enhanced Span-based Decomposition Method for Few-Shot Sequence Labeling"
☆28 Updated 2 years ago
xu1998hz / InstructScore_SEScore3
First explanation metric (diagnostic report) for text generation evaluation
☆62 Updated 5 months ago
yizhen20133868 / CI-ToD
PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialog…
☆26 Updated 3 years ago
RUCKBReasoning / GLM-Dialog
☆59 Updated 2 years ago
wzhouad / context-faithful-llm
Code and data for paper "Context-faithful Prompting for Large Language Models".
☆41 Updated 2 years ago
princeton-nlp / c-sts
[EMNLP 2023] C-STS: Conditional Semantic Textual Similarity
☆73 Updated last year
Shark-NLP / CoNT
[NeurIPS'22 Spotlight] Data and code for our paper CoNT: Contrastive Neural Text Generation
☆154 Updated 2 years ago
HappyGu0524 / MultiControl
☆42 Updated last year
dqxiu / PLMs-with-Knowledge
☆16 Updated 3 years ago
HillZhang1999 / RobustGEC
Code and data for our paper "RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation" (EMNLP 2023)
☆17 Updated last year
prakharguptaz / Instructdial
Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
☆100 Updated 2 years ago
dqxiu / CaliNet
☆33 Updated 2 years ago
DAMO-NLP-SG / TempReason
☆34 Updated last year
qinyiwei / InfoBench
☆55 Updated 11 months ago
lgw863 / LogiQA-dataset
☆138 Updated 4 years ago
THU-KEG / KoLA
[ICLR24] The open-source repo of THU-KEG's KoLA benchmark.
☆51 Updated last year
TsinghuaAI / CUGE
☆53 Updated 3 years ago