alexa / massive
Tools and Modeling Code for the MASSIVE dataset
☆538 Updated last year
Related projects
Alternatives and complementary repositories for massive
- Web-scale retrieval for knowledge-intensive NLP ☆555 Updated last year
- Multi-angle c(q)uestion answering ☆458 Updated 2 years ago
- ☆487 Updated 9 months ago
- Library for Knowledge Intensive Language Tasks ☆916 Updated 2 years ago
- Adversarial Natural Language Inference Benchmark ☆389 Updated 2 years ago
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te… ☆251 Updated last month
- Scripts and links to recreate the ELI5 dataset. ☆318 Updated 3 years ago
- A library to synthesize text datasets using Large Language Models (LLM) ☆151 Updated last year
- XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 ty… ☆631 Updated last year
- CoVoST: A Large-Scale Multilingual Speech-To-Text Translation Corpus (CC0 Licensed) ☆352 Updated 3 years ago
- Stanford's Alexa Prize socialbot ☆131 Updated last year
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆351 Updated last year
- UnifiedQA: Crossing Format Boundaries With a Single QA System ☆428 Updated 2 years ago
- Repository containing code for "How to Train BERT with an Academic Budget" paper ☆309 Updated last year
- Conversational text Analysis using various NLP techniques ☆178 Updated last year
- NeuSpell: A Neural Spelling Correction Toolkit ☆673 Updated last year
- An Analysis Toolkit for Natural Language Generation (Translation, Captioning, Summarization, etc.) ☆443 Updated 3 weeks ago
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them" ☆202 Updated 3 years ago
- TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and … ☆292 Updated 4 years ago
- DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue ☆281 Updated last year
- ☆194 Updated this week
- DialogSum: A Real-life Scenario Dialogue Summarization Dataset - Findings of ACL 2021 ☆172 Updated last year
- A dataset containing human-human knowledge-grounded open-domain conversations. ☆632 Updated 3 months ago
- ☆363 Updated last week
- A suite of tools for managing crowdsourcing tasks from the inception through to data packaging for research use. ☆305 Updated this week
- ☆1,252 Updated last year
- Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization) ☆457 Updated 2 years ago
- This repository contains the code for "Generating Datasets with Pretrained Language Models". ☆187 Updated 3 years ago
- Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging F… ☆564 Updated last year
- Autoregressive Entity Retrieval ☆765 Updated last year