yooli23 / LEGOEval
A toolkit for dialogue system evaluation via crowdsourcing
☆18Updated 2 years ago
Alternatives and similar repositories for LEGOEval
Users who are interested in LEGOEval are comparing it to the libraries listed below.
- ☆62Updated 3 years ago
- [ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering☆44Updated 3 years ago
- Unified MultiWOZ evaluation scripts for the context-to-response task.☆59Updated 2 years ago
- AAAI 2021: "UBAR: Towards Fully End-to-End Task-Oriented Dialog System with GPT-2"☆97Updated 4 years ago
- Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context, AAAI 2020.☆43Updated last year
- ☆22Updated 4 years ago
- ☆83Updated 2 years ago
- Code for ACL 2020 paper: USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation (https://arxiv.org/pdf/2005.0045…☆50Updated 3 years ago
- ☆53Updated 2 years ago
- The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval".☆28Updated 4 years ago
- MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset☆69Updated 3 years ago
- ☆23Updated 3 years ago
- ☆35Updated 2 years ago
- Data from the publication "Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialog…☆24Updated 5 years ago
- Source code of paper “A Novel Three-Stage Learning Framework for Low-Resource Knowledge-Grounded Dialogue Generation”☆16Updated 4 years ago
- The source code of our ACL paper "A Training-free and Reference-free Summarization Evaluation Metric via Centrality-weighted Relevance an…☆14Updated 2 years ago
- Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning☆100Updated 2 years ago
- ☆12Updated 3 years ago
- ☆101Updated 3 years ago
- Zero-shot dialogue state tracking (DST)☆83Updated 4 years ago
- A benchmark dataset for evaluating dialog system and natural language generation metrics.☆39Updated 3 years ago
- Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 20…☆109Updated 3 years ago
- Code for Predictive Engagement: An Efficient Metric for Automatic Evaluation of Open-Domain Dialogue Systems☆16Updated 4 years ago
- ☆25Updated 3 years ago
- [EMNLP 2020] Discern: Discourse-Aware Entailment Reasoning Network for Conversational Machine Reading☆38Updated 3 years ago
- Code for ACL 2021 paper: "GLGE: A New General Language Generation Evaluation Benchmark"☆57Updated 3 years ago
- The code for "STYLEDGPT: Stylized Response Generation with Pre-trained Language Models" (Findings of EMNLP 2020)☆21Updated 5 years ago
- Authors' implementation of the paper Adaptive Information Seeking for Open-Domain Question Answering, published in EMNLP 2021.☆38Updated 2 years ago
- ☆19Updated 4 years ago
- We construct and introduce DIALFACT, a testing benchmark dataset of crowd-annotated conversational claims, paired with pieces of evidence fr…☆44Updated 3 years ago