successar / Eraser-Benchmark-Baseline-Models
Baseline for ERASER benchmark
☆17 Updated 2 years ago
Alternatives and similar repositories for Eraser-Benchmark-Baseline-Models
Users interested in Eraser-Benchmark-Baseline-Models are comparing it to the repositories listed below.
- Code and datasets for the EMNLP 2020 paper "Calibration of Pre-trained Transformers" ☆61 Updated 2 years ago
- Code for Repl4NLP paper "A Cross-Task Analysis of Text Span Representations" ☆21 Updated 2 years ago
- Implementation for https://arxiv.org/abs/2005.00652 ☆28 Updated 2 years ago
- Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning" ☆22 Updated 4 years ago
- ☆24 Updated 4 years ago
- ☆46 Updated 2 years ago
- Code repo for EMNLP 2019 WIQA dataset paper ☆13 Updated 2 years ago
- ☆28 Updated 2 years ago
- ☆27 Updated 2 years ago
- ☆17 Updated 5 years ago
- Repository containing code for the NAACL 2021 paper (Incorporating External Knowledge to Enhance Tabular Reasoning) ☆17 Updated 4 years ago
- NILE : Natural Language Inference with Faithful Natural Language Explanations ☆30 Updated 2 years ago
- ☆20 Updated 3 years ago
- A unified approach to explain conditional text generation models. Pytorch. The code of paper "Local Explanation of Dialogue Response Gene… ☆16 Updated 3 years ago
- SP-10K is a large-scale human-annotated selectional preference set. Five selectional preference relations are included. ☆12 Updated 5 years ago
- A benchmark for understanding and evaluating rationales: http://www.eraserbenchmark.com/ ☆98 Updated 2 years ago
- Framework for testing models with AI2 leaderboards ☆21 Updated last year
- Code for paper "Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?" ☆22 Updated 5 years ago
- ☆13 Updated 5 years ago
- Codebase for running (conditional) probing experiments ☆21 Updated 2 years ago
- ☆63 Updated 5 years ago
- Pytorch implementation of DiffMask ☆57 Updated 2 years ago
- Debiasing Methods in Natural Language Understanding Make Bias More Accessible: Code and Data ☆14 Updated 3 years ago
- ☆49 Updated 2 years ago
- Data and code for "A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization" (ACL 2020) ☆49 Updated 2 years ago
- Tools and datasets for Aristo Leaderboards ☆42 Updated 4 years ago
- Code and dataset for the EMNLP 2021 Finding paper "Can NLI Models Verify QA Systems’ Predictions?" ☆25 Updated 2 years ago
- This repository contains the code for "How many data points is a prompt worth?" ☆48 Updated 4 years ago
- NABERT model for solving the DROP dataset ☆26 Updated 6 years ago
- FaiRR: Faithful and Robust Deductive Reasoning over Natural Language (ACL 2022) ☆14 Updated 3 years ago