disi-unibo-nlp / nlg-metricverse
[COLING22] An End-to-End Library for Evaluating Natural Language Generation
☆87Updated 10 months ago
Related projects
Alternatives and complementary repositories for nlg-metricverse
- ☆80Updated last year
- First explanation metric (diagnostic report) for text generation evaluation☆60Updated 3 months ago
- Lexically constrained text generation with CBART.☆47Updated 2 years ago
- Code for our paper "CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation" (ACL 2022)☆32Updated 2 years ago
- Code base of In-Context Learning for Dialogue State tracking☆44Updated last year
- ☆44Updated last year
- [ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering☆45Updated 2 years ago
- Repository for ACL'22 paper: Dynamic Latent Extraction for Abstractive Long-Input Summarization☆55Updated last year
- ☆90Updated 7 months ago
- ☆70Updated 3 years ago
- Official Github repo for the paper "Evaluating the Evaluation of Diversity in Natural Language Generation"☆19Updated 3 years ago
- Authors' implementation of the paper Adaptive Information Seeking for Open-Domain Question Answering, published in EMNLP 2021.☆37Updated last year
- Benchmark for evaluating open-ended generation☆44Updated this week
- Dataset for NAACL 2021 paper: "QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization"☆110Updated last year
- Source code for a paper on commonsense reasoning at the 2020 Annual Conference of the Association for Computational Linguistics (ACL 2020).☆28Updated 3 months ago
- code associated with ACL 2021 DExperts paper☆113Updated last year
- An original implementation of "Noisy Channel Language Model Prompting for Few-Shot Text Classification"☆131Updated 2 years ago
- Hierarchical Sketch Induction for Paraphrase Generation (Hosking et al., ACL 2022)☆51Updated last year
- TBC☆26Updated 2 years ago
- FusedChat is a dialogue dataset. It contains dialogue sessions fusing task-oriented dialogues and open-domain dialogues.☆28Updated 2 years ago
- FRANK: Factuality Evaluation Benchmark☆52Updated last year
- Detect hallucinated tokens for conditional sequence generation.☆63Updated 2 years ago
- ☆60Updated last year
- Collection of scripts to pretrain T5 on unsupervised text, using PyTorch Lightning. CORD-19 pretraining provided as an example.☆30Updated 3 years ago
- ☆57Updated 2 years ago
- Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning☆96Updated last year
- Data and code for "A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization" (ACL 2020)☆47Updated last year
- Code for the ACL 2022 paper "Contextual Representation Learning beyond Masked Language Modeling"☆33Updated 2 years ago
- ☆25Updated 2 years ago
- Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 20…☆107Updated 2 years ago