TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and without the use of translation, and is designed for the training and evaluation of automatic question answering systems. This repository provides evaluation code and a baseline system for the dataset.
☆319 · May 28, 2020 · Updated 6 years ago
Alternatives and similar repositories for tydiqa
Users that are interested in tydiqa are comparing it to the libraries listed below.
- New dataset ☆310 · Aug 31, 2021 · Updated 4 years ago
- This is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering". ☆80 · Jun 3, 2021 · Updated 5 years ago
- ReConsider is a re-ranking model that re-ranks the top-K (passage, answer-span) predictions of an Open-Domain QA Model like DPR (Karpukhi… ☆49 · Apr 26, 2021 · Updated 5 years ago
- Dataset and baseline for ACL 2019 paper "XQA: A Cross-lingual Open-domain Question Answering Dataset" ☆89 · Nov 16, 2021 · Updated 4 years ago
- ☆209 · Nov 12, 2021 · Updated 4 years ago
- Natural Questions (NQ) contains real user questions issued to Google search, and answers found from Wikipedia by annotators. NQ is design… ☆1,123 · Jul 30, 2021 · Updated 4 years ago
- XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 ty… ☆652 · Jan 4, 2023 · Updated 3 years ago
- Progressively Pretrained Dense Corpus Index for Open-Domain QA and Information Retrieval ☆43 · Jun 12, 2023 · Updated 2 years ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆82 · Feb 16, 2022 · Updated 4 years ago
- This dataset contains human judgements about answer equivalence. The data is based on SQuAD (Stanford Question Answering Dataset), and co… ☆28 · Oct 24, 2022 · Updated 3 years ago
- We are creating a challenging new benchmark MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models. Retrieval quest… ☆31 · Jul 9, 2020 · Updated 5 years ago
- The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval". ☆28 · Jun 19, 2021 · Updated 4 years ago
- We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically … ☆196 · Jun 16, 2022 · Updated 3 years ago
- This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, an… ☆569 · Jan 4, 2022 · Updated 4 years ago
- QED: A Framework and Dataset for Explanations in Question Answering ☆119 · Aug 3, 2021 · Updated 4 years ago
- [EMNLP 2020] Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆397 · Nov 7, 2023 · Updated 2 years ago
- Library for Knowledge Intensive Language Tasks ☆974 · Mar 31, 2022 · Updated 4 years ago
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering ☆174 · Jun 6, 2021 · Updated 5 years ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ☆2,368 · Mar 23, 2024 · Updated 2 years ago
- ☆1,294 · Dec 15, 2022 · Updated 3 years ago
- DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue ☆287 · Jul 6, 2023 · Updated 2 years ago
- Dense Passage Retriever - is a set of tools and models for open domain Q&A task. ☆1,866 · Apr 6, 2023 · Updated 3 years ago
- Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering. ☆1,753 · Dec 20, 2023 · Updated 2 years ago
- This is the official implementation of NeurIPS 2021 "One Question Answering Model for Many Languages with Cross-lingual Dense Passage Ret… ☆71 · Apr 1, 2022 · Updated 4 years ago
- UnifiedQA: Crossing Format Boundaries With a Single QA System ☆445 · May 9, 2022 · Updated 4 years ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions" ☆123 · Apr 23, 2022 · Updated 4 years ago
- A BART version of an open-domain QA model in a closed-book setup ☆119 · Aug 13, 2020 · Updated 5 years ago
- Shared repository for open-sourced projects from the Google AI Language team. ☆1,779 · May 19, 2026 · Updated 3 weeks ago
- Neural text-to-text question generation ☆216 · Nov 13, 2020 · Updated 5 years ago
- This repository contains the source code and links to some datasets used in the CoNLL 2019 paper "Learning to Represent Bilingual Diction… ☆12 · Oct 1, 2020 · Updated 5 years ago
- The official implementation of ICLR 2020, "Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering". ☆437 · Jul 25, 2024 · Updated last year
- LAnguage Model Analysis ☆1,387 · Jul 7, 2024 · Updated last year
- Code and datasets of "Multilingual Extractive Reading Comprehension by Runtime Machine Translation" ☆40 · Jan 2, 2019 · Updated 7 years ago
- Code associated with the Don't Stop Pretraining ACL 2020 paper ☆543 · Nov 15, 2021 · Updated 4 years ago
- ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: giv… ☆466 · Sep 11, 2024 · Updated last year
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 · Dec 8, 2022 · Updated 3 years ago
- Evaluation framework for open-domain question answering. ☆20 · May 16, 2021 · Updated 5 years ago
- A Corpus for Multilingual Document Classification in Eight Languages. ☆153 · Jun 6, 2022 · Updated 4 years ago
- BLEURT is a metric for Natural Language Generation based on transfer learning. ☆792 · Aug 4, 2023 · Updated 2 years ago