cybertronai / Megatron-LM
Ongoing research training transformer language models at scale, including: BERT
☆16 Updated 6 years ago
Alternatives and similar repositories for Megatron-LM
Users who are interested in Megatron-LM compare it to the libraries listed below.
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines ☆137 Updated 2 years ago
- Code from the paper "What do Models Learn from Question Answering Datasets?" (EMNLP 2020) ☆55 Updated 4 years ago
- Few-shot NLP benchmark for unified, rigorous eval ☆92 Updated 3 years ago
- Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding (AAAI 2020) - PyTorch Implementation ☆32 Updated 2 years ago
- Temporal Commonsense Reasoning in Dialog ☆72 Updated 4 years ago
- QED: A Framework and Dataset for Explanations in Question Answering ☆117 Updated 4 years ago
- A BART version of an open-domain QA model in a closed-book setup ☆119 Updated 5 years ago
- Cross-lingual GLUE ☆49 Updated 2 years ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions" ☆119 Updated 3 years ago
- The official implementation of ACL 2020, "Logic-Guided Data Augmentation and Regularization for Consistent Question Answering". ☆71 Updated last year
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale ☆156 Updated last year
- ☆48 Updated 5 years ago
- NAACL'19: "Jointly Optimizing Diversity and Relevance in Neural Response Generation" ☆73 Updated 4 years ago
- A repository for our AAAI-2020 Cross-lingual-NER paper. Code will be updated shortly. ☆47 Updated 2 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021 ☆33 Updated 4 years ago
- Source code accompanying the KONVENS 2019 paper "Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Em… ☆66 Updated 5 years ago
- EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation ☆98 Updated 2 years ago
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics". ☆16 Updated 3 years ago
- ☆40 Updated 4 years ago
- Code for experiments on OpenBookQA from the EMNLP 2018 paper "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Quest… ☆128 Updated 4 years ago
- ☆97 Updated 5 years ago
- Code for the CIKM 2019 Paper: How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations ☆32 Updated 2 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆27 Updated 3 years ago
- ☆46 Updated 5 years ago
- Code from the paper "Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity" ☆19 Updated 5 years ago
- Code to support the paper "Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets" ☆66 Updated 4 years ago
- Repository for the Question Answering via Sentence Composition (QASC) dataset ☆56 Updated 2 years ago
- Fork of huggingface/pytorch-pretrained-BERT for BERT on STILTs ☆106 Updated 2 years ago
- Hyperparameter Search for AllenNLP ☆139 Updated 6 months ago
- Code for the paper "True Few-Shot Learning in Language Models" (https://arxiv.org/abs/2105.11447) ☆145 Updated 3 years ago