google-research-datasets / tydiqa
TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages. The questions were written without seeing the answer and without the use of translation, and the dataset is designed for training and evaluating automatic question answering systems. This repository provides evaluation code and a baseline system for the dataset.
☆299 Updated 4 years ago
Alternatives and similar repositories for tydiqa:
Users who are interested in tydiqa compare it to the repositories listed below:
- New dataset ☆300 Updated 3 years ago
- ☆186 Updated 3 years ago
- Interpretable Evaluation for (Almost) All NLP Tasks ☆195 Updated 2 years ago
- Topic-Aware Convolutional Neural Networks for Extreme Summarization ☆355 Updated last year
- ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: giv… ☆438 Updated 4 months ago
- DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue ☆283 Updated last year
- Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive… ☆428 Updated last year
- Scripts and links to recreate the ELI5 dataset. ☆320 Updated 3 years ago
- Officially supported AllenNLP models ☆534 Updated 2 years ago
- BERT for Coreference Resolution ☆446 Updated 2 years ago
- KnowBert -- Knowledge Enhanced Contextual Word Representations ☆373 Updated 4 years ago
- The official tool for creating proceedings for conferences of the Association for Computational Linguistics (ACL). ☆220 Updated 2 weeks ago
- Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper ☆382 Updated 7 months ago
- We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically … ☆171 Updated 2 years ago
- This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, an… ☆555 Updated 3 years ago
- Resources for the NAACL 2018 paper "A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents" ☆364 Updated last year
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them" ☆202 Updated 3 years ago
- Code to reproduce the experiments from the paper. ☆101 Updated last year
- Full Python implementation of the ROUGE metric, producing same results as in the official perl implementation. ☆157 Updated 5 years ago
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ☆221 Updated 2 years ago
- An elaborate and exhaustive paper list for Named Entity Recognition (NER) ☆394 Updated 2 years ago
- Unsupervised Question answering via Cloze Translation ☆219 Updated 2 years ago
- Dataset for NAACL 2021 paper: "DART: Open-Domain Structured Data Record to Text Generation" ☆148 Updated 2 years ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆354 Updated last year
- EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data" ☆336 Updated 2 months ago
- Neural Question Generation using the SQuAD and NewsQA datasets ☆109 Updated 2 years ago
- Easier Automatic Sentence Simplification Evaluation ☆160 Updated last year
- A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT) ☆133 Updated 11 months ago
- Please see the readme file as well as our 2019 EMNLP paper linked here --> ☆198 Updated 9 months ago
- Adversarial Natural Language Inference Benchmark ☆393 Updated 2 years ago