alexa / massive
Tools and Modeling Code for the MASSIVE dataset
★545 · Updated 2 years ago
Alternatives and similar repositories for massive:
Users who are interested in massive are comparing it to the libraries listed below.
- NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations ★781 · Updated 11 months ago
- [ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.o… ★604 · Updated 2 years ago
- TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and … ★303 · Updated 4 years ago
- DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue ★283 · Updated last year
- ★505 · Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ★333 · Updated last year
- UnifiedQA: Crossing Format Boundaries With a Single QA System ★433 · Updated 2 years ago
- BLEURT is a metric for Natural Language Generation based on transfer learning. ★726 · Updated last year
- Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing ★750 · Updated 6 months ago
- Repository containing code for "How to Train BERT with an Academic Budget" paper ★313 · Updated last year
- Interpretable Evaluation for AI Systems ★364 · Updated 2 years ago
- SpikeX - SpaCy Pipes for Knowledge Extraction ★398 · Updated 3 years ago
- Code and data to support the paper "PAQ 65 Million Probably-Asked Questions and What You Can Do With Them" ★202 · Updated 3 years ago
- Scripts and links to recreate the ELI5 dataset. ★325 · Updated 3 years ago
- NeuSpell: A Neural Spelling Correction Toolkit ★692 · Updated last year
- An Analysis Toolkit for Natural Language Generation (Translation, Captioning, Summarization, etc.) ★445 · Updated last month
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale ★154 · Updated last year
- Text2Text Language Modeling Toolkit ★300 · Updated 3 months ago
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ★161 · Updated 7 months ago
- ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: giv… ★444 · Updated 7 months ago
- Multi-angle c(q)uestion answering ★458 · Updated 2 years ago
- Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing ★644 · Updated 2 years ago
- Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models … ★231 · Updated 2 years ago
- A Python framework for performing information retrieval experiments, building on http://terrier.org/ ★452 · Updated 2 weeks ago
- Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper ★391 · Updated 10 months ago
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te… ★272 · Updated 3 months ago
- Collection of papers and resources for data augmentation for NLP. ★827 · Updated 2 years ago
- Stanford's Alexa Prize socialbot ★133 · Updated last year
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ★361 · Updated last year
- DialogSum: A Real-life Scenario Dialogue Summarization Dataset - Findings of ACL 2021 ★177 · Updated 4 months ago