dig-team / hanna-benchmark-asg
HANNA, a large annotated dataset of Human-ANnotated NArratives for ASG evaluation.
☆35 · Oct 15, 2024 · Updated last year
Alternatives and similar repositories for hanna-benchmark-asg
Users who are interested in hanna-benchmark-asg are comparing it to the repositories listed below.
- Benchmark for evaluating open-ended generation ☆50 · Nov 6, 2024 · Updated last year
- The official repository for our EMNLP 2024 paper, Themis: A Reference-free NLG Evaluation Language Model with Flexibility and Interpretab… ☆21 · Feb 23, 2025 · Updated 11 months ago
- ☆34 · Jan 7, 2026 · Updated last month
- ☆39 · Jun 7, 2023 · Updated 2 years ago
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024) ☆47 · Jan 21, 2025 · Updated last year
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va… ☆12 · Nov 6, 2023 · Updated 2 years ago
- Code for ACL 2020 paper: USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation (https://arxiv.org/pdf/2005.0045… ☆49 · Dec 8, 2022 · Updated 3 years ago
- What are the best Systems? New Perspectives on NLP Benchmarking ☆13 · Mar 16, 2023 · Updated 2 years ago
- Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification ☆16 · Jan 8, 2024 · Updated 2 years ago
- NLG evaluation via Statistical Measures of Similarity: BaryScore, DepthScore, InfoLM ☆39 · Dec 27, 2022 · Updated 3 years ago
- Codes for our paper "CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation" (ACL 2022) ☆33 · Jun 6, 2022 · Updated 3 years ago
- [ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation" ☆42 · Jul 19, 2024 · Updated last year
- ☆18 · Oct 8, 2024 · Updated last year
- The Gene UI components library designed for BI tools ☆17 · Updated this week
- Personalized Story Evaluation Model ☆18 · Nov 27, 2023 · Updated 2 years ago
- ☆22 · Feb 26, 2024 · Updated last year
- Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation ☆215 · Feb 10, 2024 · Updated 2 years ago
- BARTScore: Evaluating Generated Text as Text Generation ☆367 · Jun 27, 2022 · Updated 3 years ago
- Dialogue Planning via Brownian Bridge Stochastic Process for Goal-directed Proactive Dialogue (ACL Findings 2023) ☆21 · Nov 10, 2025 · Updated 3 months ago
- ☆50 · Feb 5, 2023 · Updated 3 years ago
- The source code of the paper 'Dynamic Knowledge Routing Network For Target-Guided Open-Domain Conversation' ☆24 · Mar 24, 2023 · Updated 2 years ago
- Reading SSL Backwards, Season 3 - VLM ☆20 · Jun 18, 2023 · Updated 2 years ago
- Data Valuation on In-Context Examples (ACL23) ☆24 · Jan 12, 2025 · Updated last year
- ☆59 · Aug 22, 2024 · Updated last year
- Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper ☆411 · Jun 23, 2024 · Updated last year
- ☆32 · Nov 16, 2021 · Updated 4 years ago
- Reading self-supervised learning in NLP backwards ☆27 · Oct 30, 2022 · Updated 3 years ago
- Code for SIGdial 2020 paper: Unsupervised Evaluation of Interactive Dialog with DialoGPT (https://arxiv.org/abs/2006.12719) ☆28 · Jun 8, 2020 · Updated 5 years ago
- ☆71 · Oct 29, 2021 · Updated 4 years ago
- DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence ☆36 · Jul 25, 2023 · Updated 2 years ago
- Evaluate the Quality of Critique ☆36 · Jun 1, 2024 · Updated last year
- Lean Design System Docs ☆10 · Nov 10, 2021 · Updated 4 years ago
- An Upgraded Fire Emblem Fates Randomizer ☆10 · Jan 28, 2025 · Updated last year
- ☆10 · Nov 8, 2022 · Updated 3 years ago
- Generative and Parametric design code: featuring Processing / Python / Javascript / HTML / CSS ☆14 · Nov 4, 2020 · Updated 5 years ago
- ☆34 · Jul 25, 2024 · Updated last year
- ☆10 · Nov 1, 2022 · Updated 3 years ago
- Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Textual Style Transfer ☆36 · Oct 2, 2022 · Updated 3 years ago
- ☆144 · Sep 10, 2023 · Updated 2 years ago