cybertronai / Megatron-LM
Ongoing research training transformer language models at scale, including: BERT
☆16Updated 6 years ago
Alternatives and similar repositories for Megatron-LM
Users who are interested in Megatron-LM are comparing it to the repositories listed below.
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines☆138Updated 2 years ago
- QED: A Framework and Dataset for Explanations in Question Answering☆119Updated 4 years ago
- Code for the CIKM 2019 Paper: How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations☆32Updated 2 years ago
- A BART version of an open-domain QA model in a closed-book setup☆119Updated 5 years ago
- Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding (AAAI 2020) - PyTorch Implementation☆34Updated 2 years ago
- Temporal Commonsense Reasoning in Dialog☆72Updated 4 years ago
- Code from the paper "What do Models Learn from Question Answering Datasets?" (EMNLP 2020)☆54Updated 5 years ago
- Assessing syntactic abilities of BERT☆40Updated 6 years ago
- Cross-lingual GLUE☆49Updated 2 years ago
- ☆78Updated last year
- Few-shot NLP benchmark for unified, rigorous eval☆93Updated 3 years ago
- Viewer for the 🤗 datasets library.☆86Updated 4 years ago
- Data & Code for ACCENTOR: "Adding Chit-Chat to Enhance Task-Oriented Dialogues" (NAACL 2021)☆72Updated 4 years ago
- Training T5 to perform numerical reasoning.☆23Updated 4 years ago
- ☆68Updated 9 months ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions"☆120Updated 3 years ago
- ☆39Updated 3 years ago
- NAACL'19: "Jointly Optimizing Diversity and Relevance in Neural Response Generation"☆73Updated 5 years ago
- Fine-tune transformers with pytorch-lightning☆44Updated 3 years ago
- ☆40Updated 5 years ago
- Helper scripts and notes that were used while porting various nlp models☆49Updated 3 years ago
- EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation☆97Updated 2 years ago
- A crowdsourced dataset of dialogues grounded in social contexts involving utilization of commonsense.☆80Updated 4 years ago
- ☆97Updated 3 years ago
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale☆157Updated 2 years ago
- A repository for our AAAI-2020 Cross-lingual-NER paper. Code will be updated shortly.☆47Updated 3 years ago
- Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper☆51Updated 2 years ago
- This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations".☆64Updated 5 years ago
- ☆48Updated 5 years ago
- Hyperparameter Search for AllenNLP☆140Updated 11 months ago