Spico197 / awesome-lm-evaluation
🩺 A collection of ChatGPT evaluation reports on various benchmarks.
☆48 Updated last year
Alternatives and similar repositories for awesome-lm-evaluation:
Users who are interested in awesome-lm-evaluation are comparing it to the repositories listed below:
- PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialog… ☆27 Updated 3 years ago
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study ☆43 Updated last year
- The code implementation of the EMNLP2022 paper: DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Gene… ☆25 Updated last year
- Code & Data for our Paper "RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation" (EMNLP 2023) ☆17 Updated last year
- ☆70 Updated 2 years ago
- Source code for ACL 2022 Paper "Prompt-based Data Augmentation for Low-Resource NLU Tasks" ☆67 Updated last year
- ☆16 Updated 11 months ago
- code for Teaching LM to Translate with Comparison ☆39 Updated last year
- First explanation metric (diagnostic report) for text generation evaluation ☆63 Updated 7 months ago
- Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning ☆36 Updated last year
- ✒️ ChatGPT as a writing partner. ☆14 Updated last year
- Official code for "Continual Prompt Tuning for Dialog State Tracking" (ACL 2022). ☆27 Updated last year
- ☆14 Updated last year
- [ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering ☆45 Updated 2 years ago
- Calculate the probability of a paper being accepted by EMNLP2023 based on score distribution of ACL2023. ☆14 Updated last year
- OMGEval😮: An Open Multilingual Generative Evaluation Benchmark for Foundation Models ☆32 Updated 6 months ago
- Paradigm shift in natural language processing ☆42 Updated 2 years ago
- EMNLP 2022: ClidSum: A Benchmark Dataset for Cross-Lingual Dialogue Summarization ☆35 Updated last year
- Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI2024) ☆46 Updated 9 months ago
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉 ☆12 Updated last year
- UNISUMM: Unified Few-shot Summarization with Multi-Task Pre-Training and Prefix-Tuning ☆60 Updated last year
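- This document aims to compile researchers and research institutions working on text generation in industry and business, in China and abroad. Listed in no particular order. Work in progress; contributions are welcome. ☆48 Updated 4 years ago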
- Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning ☆99 Updated last year
- Code for KE-Blender, EMNLP 2021 ☆19 Updated 2 years ago
- Code base of In-Context Learning for Dialogue State tracking ☆45 Updated last year
- Code for our SIGIR 2022 accepted paper: P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based L… ☆17 Updated last year
- [Findings of ACL'2023] Improving Contrastive Learning of Sentence Embeddings from AI Feedback ☆39 Updated last year
- The repository for ACL 2022 paper: Other Roles Matter! Enhancing Role-Oriented Dialogue Summarization via Role Interactions ☆26 Updated 2 years ago
- This project maintains a reading list for general text generation tasks ☆65 Updated 3 years ago
- ☆30 Updated last year