Natural Questions (NQ) contains real user questions issued to Google search, and answers found in Wikipedia by annotators. NQ is designed for the training and evaluation of automatic question answering systems.
☆1,121 · Jul 30, 2021 · Updated 4 years ago
Alternatives and similar repositories for natural-questions
Users that are interested in natural-questions are comparing it to the libraries listed below.
- Shared repository for open-sourced projects from the Google AI Language team. ☆1,775 · May 19, 2026 · Updated last week
- Dense Passage Retriever is a set of tools and models for open-domain Q&A tasks. ☆1,866 · Apr 6, 2023 · Updated 3 years ago
- TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and … ☆319 · May 28, 2020 · Updated 6 years ago
- ☆31 · Jun 19, 2020 · Updated 5 years ago
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them" ☆211 · Aug 31, 2021 · Updated 4 years ago
- Library for Knowledge Intensive Language Tasks ☆973 · Mar 31, 2022 · Updated 4 years ago
- Resources for the MRQA 2019 Shared Task ☆293 · Aug 5, 2021 · Updated 4 years ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions" ☆123 · Apr 23, 2022 · Updated 4 years ago
- Datasets for Question Answering by Search and Reading ☆70 · Jan 19, 2018 · Updated 8 years ago
- Scripts and links to recreate the ELI5 dataset. ☆324 · Aug 31, 2021 · Updated 4 years ago
- MS MARCO (Microsoft Machine Reading Comprehension) is a large scale dataset focused on machine reading comprehension and question answerin… ☆230 · Jun 12, 2023 · Updated 2 years ago
- ACL2020 Tutorial: Open-Domain Question Answering ☆835 · Jan 1, 2021 · Updated 5 years ago
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ☆6,520 · Jan 14, 2026 · Updated 4 months ago
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. ☆2,198 · Oct 16, 2025 · Updated 7 months ago
- New dataset ☆310 · Aug 31, 2021 · Updated 4 years ago
- Code for the TriviaQA reading comprehension dataset ☆337 · Apr 5, 2024 · Updated 2 years ago
- Fusion-in-Decoder ☆593 · Oct 4, 2023 · Updated 2 years ago
- ☆437 · Feb 4, 2024 · Updated 2 years ago
- This repository contains the NarrativeQA dataset. It includes the list of documents with Wikipedia summaries, links to full stories, and … ☆513 · Apr 15, 2020 · Updated 6 years ago
- ☆178 · May 28, 2019 · Updated 7 years ago
- Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations. ☆2,076 · Updated this week
- XLNet: Generalized Autoregressive Pretraining for Language Understanding ☆6,175 · May 28, 2023 · Updated 3 years ago
- We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically … ☆194 · Jun 16, 2022 · Updated 3 years ago
- An original implementation of ACL 2019, "Multi-hop Reading Comprehension through Question Decomposition and Rescoring" ☆138 · Apr 23, 2022 · Updated 4 years ago
- Authors' implementation of EMNLP-IJCNLP 2019 paper "Answering Complex Open-domain Questions Through Iterative Query Generation" ☆195 · Oct 29, 2019 · Updated 6 years ago
- Reading Wikipedia to Answer Open-Domain Questions ☆4,471 · Oct 1, 2023 · Updated 2 years ago
- Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning ☆776 · Apr 7, 2023 · Updated 3 years ago
- Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granul… ☆1,542 · May 31, 2023 · Updated 2 years ago
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ☆2,930 · Feb 14, 2023 · Updated 3 years ago
- ☆589 · Apr 26, 2021 · Updated 5 years ago
- This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, an… ☆569 · Jan 4, 2022 · Updated 4 years ago
- An original implementation of EMNLP 2019, "A Discrete Hard EM Approach for Weakly Supervised Question Answering" ☆135 · Jul 3, 2020 · Updated 5 years ago
- A novel embedding training algorithm that leverages ANN search and achieves SOTA retrieval on TREC DL 2019 and OpenQA benchmarks ☆385 · Jan 6, 2026 · Updated 4 months ago
- docTTTTTquery document expansion model ☆375 · Mar 25, 2023 · Updated 3 years ago
- Adversarial Natural Language Inference Benchmark ☆399 · May 12, 2022 · Updated 4 years ago
- An open-source NLP research library, built on PyTorch. ☆11,896 · Nov 22, 2022 · Updated 3 years ago
- DeepThought's solution ☆79 · Sep 6, 2023 · Updated 2 years ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ☆2,369 · Mar 23, 2024 · Updated 2 years ago
- EMNLP 2021 - Pre-training architectures for dense retrieval ☆256 · Mar 18, 2022 · Updated 4 years ago