ngoyal2707 / Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆18Updated 2 years ago
Alternatives and similar repositories for Megatron-LM:
Users who are interested in Megatron-LM are comparing it to the libraries listed below.
- Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021)☆10Updated 3 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Updated last year
- Compute-optimal LLMs☆11Updated 2 years ago
- ☆11Updated 2 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP☆58Updated 2 years ago
- URL downloader supporting checkpointing and continuous checksumming.☆19Updated last year
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod…☆14Updated last year
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval☆14Updated last year
- Open source library for few shot NLP☆77Updated last year
- A package for fine tuning of pretrained NLP transformers using Semi Supervised Learning☆15Updated 3 years ago
- This repo contains data and code for the paper "Reasoning over Public and Private Data in Retrieval-Based Systems."☆46Updated 8 months ago
- Code for the paper "Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966).☆58Updated 3 years ago
- A collection of building blocks for building fine-tunable metric learning models☆32Updated 2 months ago
- Highly specialized crate to parse and use `google/sentencepiece`'s precompiled_charsmap in `tokenizers`☆18Updated 2 years ago
- Code for the paper "Accessing higher dimensions for unsupervised word translation"☆21Updated last year
- Code for Stage-wise Fine-tuning for Graph-to-Text Generation☆26Updated 2 years ago
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.☆32Updated 2 years ago
- Minimum Description Length probing for neural network representations☆19Updated last month
- Using short models to classify long texts☆21Updated 2 years ago
- Generative Retrieval Transformer☆28Updated last year
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators☆24Updated last year
- Transformers at any scale☆41Updated last year
- Embedding Recycling for Language models☆38Updated last year
- Code for "CyberWallE at SemEval-2020 Task 11: An Analysis of Feature Engineering for Ensemble Models for Propaganda Detection" (V. Blasch…☆9Updated 4 years ago
- Efficiently computing & storing token n-grams from large corpora☆19Updated 5 months ago
- Helper scripts and notes that were used while porting various nlp models☆45Updated 3 years ago
- LAReQA is a challenging benchmark for evaluating language agnostic answer retrieval from a multilingual candidate pool. This repository c…☆14Updated 4 years ago
- Code for ProtAugment: Unsupervised diverse short-texts paraphrasing for intent detection meta-learning☆21Updated 2 years ago
- ☆13Updated 6 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"☆13Updated 3 years ago