Code and data for the EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics".
☆16May 3, 2022Updated 3 years ago
Alternatives and similar repositories for MOCHA
Users who are interested in MOCHA are comparing it to the repositories listed below.
- This dataset contains human judgements about answer equivalence. The data is based on SQuAD (Stanford Question Answering Dataset), and co…☆27Oct 24, 2022Updated 3 years ago
- An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions"☆16Nov 5, 2024Updated last year
- SuperCLUE automatic machine grading system for Gaokao (college entrance exam) essays☆17Jun 8, 2023Updated 2 years ago
- A repository for ACL 2022 paper "How do we answer complex questions: Discourse structure of long form answers"☆19May 31, 2025Updated 9 months ago
- Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://a…☆45Jul 30, 2022Updated 3 years ago
- ☆55Mar 27, 2023Updated 2 years ago
- Official repository of the R2-D2's pipeline☆21Nov 16, 2021Updated 4 years ago
- Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E…☆25Nov 4, 2022Updated 3 years ago
- ☆50Feb 5, 2023Updated 3 years ago
- ☆29Dec 2, 2024Updated last year
- Code for EMNLP2020 paper: "Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space"☆26May 10, 2021Updated 4 years ago
- ☆33Jun 12, 2023Updated 2 years ago
- ARCHIVED. Please use https://docs.adapterhub.ml/huggingface_hub.html || 🔌 A central repository collecting pre-trained adapter modules☆69May 26, 2024Updated last year
- The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval".☆28Jun 19, 2021Updated 4 years ago
- ☆13Oct 5, 2025Updated 4 months ago
- Attribution-based Parameter Decomposition☆34Jun 11, 2025Updated 8 months ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions"☆121Apr 23, 2022Updated 3 years ago
- resources for the IBM Airlines Table-Question-Answering Benchmark☆33Jul 11, 2022Updated 3 years ago
- Dataset for protoqa ("family feud") data☆34Dec 2, 2021Updated 4 years ago
- ☆31Jun 19, 2020Updated 5 years ago
- 🪝PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models☆12May 30, 2025Updated 9 months ago
- Training code for Sparse Autoencoders on Embedding models☆39Feb 27, 2025Updated last year
- ☆75Jul 2, 2021Updated 4 years ago
- ☆12Jan 11, 2026Updated last month
- ☆12Feb 22, 2021Updated 5 years ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- Super Flappy Bird in p5.js☆10Mar 8, 2021Updated 4 years ago
- ChatGPT CSS style☆14Apr 28, 2024Updated last year
- A framework for few-shot evaluation of autoregressive language models.☆12Jul 14, 2025Updated 7 months ago
- A framework for evaluating Machine Translation models.☆12May 26, 2025Updated 9 months ago
- ☆14Apr 29, 2025Updated 10 months ago
- A Swedish Natural Language Understanding Benchmark☆11Dec 12, 2025Updated 2 months ago
- ☆85Updated this week
- ☆38Jun 3, 2021Updated 4 years ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…☆44Aug 10, 2024Updated last year
- EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation☆97Mar 20, 2023Updated 2 years ago
- LLM red teaming datasets from the paper 'Student-Teacher Prompting for Red Teaming to Improve Guardrails' for the ART of Safety Workshop …☆22Oct 12, 2023Updated 2 years ago
- ☆12Nov 5, 2024Updated last year
- ☆10Mar 19, 2024Updated last year