google-research-datasets/tydiqa

TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and without the use of translation, and is designed for the training and evaluation of automatic question answering systems. This repository provides evaluation code and a baseline system for the dataset.

☆319

Alternatives and similar repositories for tydiqa

Users who are interested in tydiqa are comparing it to the repositories listed below.

facebookresearch / MLQA
New dataset
☆312 · Aug 31, 2021 · Updated 4 years ago
AkariAsai / XORQA
This is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering".
☆80 · Jun 3, 2021 · Updated 5 years ago
facebookresearch / reconsider
ReConsider is a re-ranking model that re-ranks the top-K (passage, answer-span) predictions of an Open-Domain QA Model like DPR (Karpukhi…
☆50 · Apr 26, 2021 · Updated 5 years ago
thunlp / XQA
Dataset and baseline for ACL 2019 paper "XQA: A Cross-lingual Open-domain Question Answering Dataset"
☆89 · Nov 16, 2021 · Updated 4 years ago
google-deepmind / xquad
☆210 · Nov 12, 2021 · Updated 4 years ago
google-research / xtreme
XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 ty…
☆651 · Jan 4, 2023 · Updated 3 years ago
google-research-datasets / natural-questions
Natural Questions (NQ) contains real user questions issued to Google search, and answers found from Wikipedia by annotators. NQ is design…
☆1,133 · Jul 30, 2021 · Updated 4 years ago
xwhan / ProQA
Progressively Pretrained Dense Corpus Index for Open-Domain QA and Information Retrieval
☆43 · Jun 12, 2023 · Updated 3 years ago
castorini / mr.tydi
Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.
☆83 · Feb 16, 2022 · Updated 4 years ago
google-research-datasets / answer-equivalence-dataset
This dataset contains human judgements about answer equivalence. The data is based on SQuAD (Stanford Question Answering Dataset), and co…
☆30 · Oct 24, 2022 · Updated 3 years ago
google-research-datasets / MultiReQA
We are creating a challenging new benchmark MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models. Retrieval quest…
☆31 · Jul 9, 2020 · Updated 6 years ago
apple / ml-mkqa
We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically …
☆193 · Jun 16, 2022 · Updated 4 years ago
AkariAsai / unanswerable_qa
The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval".
☆28 · Jun 19, 2021 · Updated 5 years ago
google-research-datasets / paws
This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, an…
☆570 · Jan 4, 2022 · Updated 4 years ago
google-research-datasets / QED
QED: A Framework and Dataset for Explanations in Question Answering
☆119 · Aug 3, 2021 · Updated 4 years ago
cisnlp / simalign
[EMNLP 2020] Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)
☆398 · Nov 7, 2023 · Updated 2 years ago
facebookresearch / KILT
Library for Knowledge Intensive Language Tasks
☆978 · Mar 31, 2022 · Updated 4 years ago
studio-ousia / bpr
Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering
☆175 · Jun 6, 2021 · Updated 5 years ago
alexa / dialoglue
DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue
☆288 · Jul 6, 2023 · Updated 3 years ago
google-research / multilingual-t5
☆1,294 · Dec 15, 2022 · Updated 3 years ago
google-research / electra
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
☆2,368 · Mar 23, 2024 · Updated 2 years ago
facebookresearch / DPR
Dense Passage Retriever - is a set of tools and models for open domain Q&A task.
☆1,868 · Apr 6, 2023 · Updated 3 years ago
shmsw25 / AmbigQA
An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions"
☆123 · Apr 23, 2022 · Updated 4 years ago
allenai / unifiedqa
UnifiedQA: Crossing Format Boundaries With a Single QA System
☆442 · May 9, 2022 · Updated 4 years ago
AkariAsai / CORA
This is the official implementation of NeurIPS 2021 "One Question Answering Model for Many Languages with Cross-lingual Dense Passage Ret…
☆71 · Apr 1, 2022 · Updated 4 years ago
deepset-ai / FARM
Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
☆1,751 · Dec 20, 2023 · Updated 2 years ago
shmsw25 / bart-closed-book-qa
A BART version of an open-domain QA model in a closed-book setup
☆118 · Aug 13, 2020 · Updated 5 years ago
bloomsburyai / question-generation
Neural text-to-text question generation
☆213 · Nov 13, 2020 · Updated 5 years ago
muhaochen / bilingual_dictionaries
This repository contains the source code and links to some datasets used in the CoNLL 2019 paper "Learning to Represent Bilingual Diction…
☆12 · Oct 1, 2020 · Updated 5 years ago
google-research / language
Shared repository for open-sourced projects from the Google AI Language team.
☆1,787 · Jun 10, 2026 · Updated last month
AkariAsai / learning_to_retrieve_reasoning_paths
The official implementation of ICLR 2020, "Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering".
☆436 · Jul 25, 2024 · Updated last year
ccasimiro88 / TranslateAlignRetrieve
Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish.
☆59 · Dec 8, 2022 · Updated 3 years ago
AkariAsai / extractive_rc_by_runtime_mt
Code and datasets of "Multilingual Extractive Reading Comprehension by Runtime Machine Translation"
☆40 · Jan 2, 2019 · Updated 7 years ago
facebookresearch / LAMA
LAnguage Model Analysis
☆1,391 · Jul 7, 2024 · Updated 2 years ago
facebookresearch / anli
Adversarial Natural Language Inference Benchmark
☆402 · May 12, 2022 · Updated 4 years ago
soco-ai / SF-QA
Evaluation framework for open-domain question answering.
☆20 · May 16, 2021 · Updated 5 years ago
allenai / dont-stop-pretraining
Code associated with the Don't Stop Pretraining ACL 2020 paper
☆543 · Nov 15, 2021 · Updated 4 years ago
facebookresearch / MLDoc
A Corpus for Multilingual Document Classification in Eight Languages.
☆153 · Jun 6, 2022 · Updated 4 years ago
gauthierdmn / question_generation
Neural Question Generation using the SQuAD and NewsQA datasets
☆109 · Dec 8, 2022 · Updated 3 years ago