nawnoes / pytorch-gpt-x
Implementation of an autoregressive language model using an improved Transformer and DeepSpeed pipeline parallelism.
☆32Updated 3 years ago
Alternatives and similar repositories for pytorch-gpt-x:
Users who are interested in pytorch-gpt-x are comparing it to the libraries listed below:
- Calculating Expected Time for training LLM.☆38Updated last year
- KETOD Knowledge-Enriched Task-Oriented Dialogue☆32Updated 2 years ago
- Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper☆52Updated last year
- A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering☆16Updated 2 years ago
- NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation (ACL-IJCNLP 2021)☆36Updated 3 years ago
- The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)☆52Updated 2 years ago
- Don't Judge a Language Model by Its Last Layer: Contrastive Learning with Layer-Wise Attention Pooling☆9Updated 2 years ago
- Prompt-based few-shot learning and text generation with prompting☆13Updated last year
- Convenient Text-to-Text Training for Transformers☆19Updated 3 years ago
- Train 🤗transformers with DeepSpeed: ZeRO-2, ZeRO-3☆23Updated 3 years ago
- Implementation of stop sequencer for Huggingface Transformers☆16Updated last year
- PyTorch reimplementation of REALM and ORQA☆22Updated 3 years ago
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch☆73Updated 2 years ago
- Python project template for personal projects! 🙋‍♀️☆10Updated 4 years ago
- This repository contains the code for the paper "Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models".☆47Updated 2 years ago
- Megatron LM 11B on Huggingface Transformers☆27Updated 3 years ago
- Plug-and-Play Conversational Models☆29Updated 2 years ago
- Long-context pretrained encoder-decoder models☆94Updated 2 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators☆24Updated last year
- Keep Me Updated! Memory Management in Long-term Conversations (Findings of EMNLP 2022)☆29Updated 2 years ago
- Source code for paper: Knowledge Inheritance for Pre-trained Language Models☆38Updated 2 years ago
- [NAACL 2021] Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering☆36Updated 3 years ago
- Abstractive summarization using Bert2Bert framework.☆31Updated 4 years ago
- Pytorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks☆63Updated 3 years ago
- Knowledge Infused Decoding☆71Updated last year