acl-org / arr-health
Monitoring the health of ARR
☆22 Updated last week
Alternatives and similar repositories for arr-health:
Users who are interested in arr-health are comparing it to the repositories listed below:
- ☆44 Updated last year
- Dataset for Unified Editing, EMNLP 2023. This is a model editing dataset where edits are natural language phrases. ☆23 Updated 7 months ago
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (Supported LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he… ☆23 Updated last month
- Code for Aesop: Paraphrase Generation with Adaptive Syntactic Control (EMNLP 2021) ☆27 Updated 3 years ago
- Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning" ☆22 Updated 3 years ago
- ☆34 Updated 3 years ago
- This code accompanies the paper DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering. ☆17 Updated 2 years ago
- ☆33 Updated 2 years ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆41 Updated last year
- The official repository for Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapte… ☆17 Updated last year
- ☆99 Updated 2 years ago
- ☆58 Updated 2 years ago
- ☆15 Updated 3 years ago
- Official codebase for “In-Context Learning with Many Demonstration Examples” ☆16 Updated 2 years ago
- ☆75 Updated last year
- ☆48 Updated 2 years ago
- ☆13 Updated 2 years ago
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain between cause-effect relationships. It is a QA dataset containing 9000… ☆47 Updated last year
- Code for the paper "Attention Temperature Matters in Abstractive Summarization Distillation" (https://arxiv.org/abs/2106.03441) ☆13 Updated 3 years ago
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆58 Updated last year
- ☆86 Updated last year
- NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks ☆20 Updated 2 years ago
- Code for paper "Extract, Denoise and Enforce: Evaluating and Improving Concept Preservation for Text-to-Text Generation" EMNLP 2021 and "… ☆18 Updated 3 years ago
- ☆82 Updated 2 years ago
- The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables" ☆20 Updated last year
- Codes for the EMNLP2021 paper: Benchmarking Commonsense Knowledge Base Population (https://aclanthology.org/2021.emnlp-main.705.pdf). An … ☆26 Updated last year
- ☆17 Updated last year
- ☆20 Updated 2 years ago
- The Dataset and Official Implementation for <The ELCo Dataset: Bridging Emoji and Lexical Composition> @ LREC-COLING 2024 ☆12 Updated 11 months ago
- Constrained Decoding Project ☆17 Updated last year