ngoyal2707 / Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆18Updated last year
Alternatives and similar repositories for Megatron-LM:
Users that are interested in Megatron-LM are comparing it to the libraries listed below
- Scripts supporting the development and serving the Roots Search Tool - https://hf.co/spaces/bigscience-data/roots-search☆10Updated last year
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators☆24Updated last year
- Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021)☆10Updated 2 years ago
- URL downloader supporting checkpointing and continuous checksumming.☆19Updated last year
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.☆32Updated last year
- ☆22Updated 2 years ago
- Documentation effort for the BookCorpus dataset☆33Updated 3 years ago
- Code for Stage-wise Fine-tuning for Graph-to-Text Generation☆26Updated last year
- ☆11Updated 2 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Updated last year
- Efficiently computing & storing token n-grams from large corpora☆17Updated 3 months ago
- ☆20Updated 3 years ago
- Hugging Face and Pyserini interoperability☆20Updated last year
- CyBERTron-LM is a project which collects some pre-trained Transformer-based models.☆12Updated last year
- ☆21Updated last year
- ☆16Updated last year
- Open source library for few shot NLP☆77Updated last year
- PALM: Pre-training an Autoencoding & Autoregressive Language Model for Context-conditioned Generation☆34Updated last year
- Highly specialized crate to parse and use `google/sentencepiece` 's precompiled_charsmap in `tokenizers`☆18Updated 2 years ago
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo…☆39Updated last year
- This repo contains all of the code for my Youtube series on how to create a VSCode extension for autocompleting code using Deep Learning!☆15Updated 3 years ago
- DeFacto - Demonstrations and Feedback for improving factual consistency of text summarization☆27Updated 2 years ago
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod…☆14Updated last year
- Code for the paper: Saying No is An Art: Contextualized Fallback Responses for Unanswerable Dialogue Queries☆19Updated 3 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"☆13Updated 3 years ago
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks☆22Updated 2 weeks ago
- Transformers at any scale☆41Updated last year
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023)☆22Updated last year
- Code for the paper-"Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966).☆57Updated 2 years ago