microsoft / deepspeed-gpt-neox
An implementation of model-parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
☆19 · Updated last year
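As background for what this repo provides, here is a rough sketch of how DeepSpeed typically wraps a PyTorch model for distributed training. This is a generic illustration, not code from this repository; the stand-in model, the `ds_config.json` file name, and the tensor shapes are assumptions.

```python
# Minimal sketch (not from this repo): wrapping a PyTorch model with DeepSpeed.
import torch
import deepspeed

# Stand-in model; deepspeed-gpt-neox trains a full GPT-style transformer instead.
model = torch.nn.TransformerEncoderLayer(d_model=512, nhead=8)

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler);
# parallelism, precision, and ZeRO settings live in the JSON config.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",  # hypothetical DeepSpeed config file
)

batch = torch.randn(16, 4, 512).to(engine.device)  # dummy (seq, batch, dim) input
loss = engine(batch).mean()  # toy objective, just to drive the loop
engine.backward(loss)        # engine handles loss scaling / partitioned grads
engine.step()                # optimizer step + gradient zeroing
```

Such a script is normally launched across ranks with the `deepspeed` launcher rather than plain `python`.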
Related projects
Alternatives and complementary repositories for deepspeed-gpt-neox
- Megatron-LM 11B on Hugging Face Transformers ☆27 · Updated 3 years ago
- Data-related codebase for the Polyglot project ☆19 · Updated last year
- 🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch ☆13 · Updated this week
- Implementation of an autoregressive language model using an improved Transformer and DeepSpeed pipeline parallelism ☆32 · Updated 2 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPU v3-8 with GCP ☆58 · Updated 2 years ago
- Convenient text-to-text training for Transformers ☆19 · Updated 2 years ago
- MeCab model trained with OpenKorPos ☆22 · Updated 2 years ago
- Large-scale distributed model training with Colossal-AI and Lightning AI ☆58 · Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should work with any Hugging Face text dataset ☆92 · Updated last year
- BERT and RoBERTa fine-tuning on the SQuAD dataset using pytorch-lightning ⚡️, 🤗 Transformers, and 🤗 NLP ☆36 · Updated last year
- Generative Retrieval Transformer ☆29 · Updated last year
- ↔️ T5 machine translation from English to Korean ☆17 · Updated 2 years ago
- A minimal PyTorch Lightning OpenAI GPT with DeepSpeed training ☆111 · Updated last year
- Implementation of a stop sequencer for Hugging Face Transformers (see the sketch after this list) ☆15 · Updated last year
- Repo for the "Smart Word Suggestions" (SWS) task and benchmark ☆19 · Updated 11 months ago
- Experiments with generating open-source language-model assistants ☆97 · Updated last year
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆24 · Updated last year
- GPT-jax based on the official Hugging Face library ☆13 · Updated 3 years ago
- Improving neural text generation with reinforcement learning ☆21 · Updated 3 years ago
- A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training ☆26 · Updated 2 years ago
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining ☆12 · Updated 11 months ago
- 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX ☆82 · Updated 2 years ago
- NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation (ACL-IJCNLP 2021) ☆36 · Updated 3 years ago
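For the stop-sequencer entry above, the sketch below shows the general idea using the public `StoppingCriteria` API in 🤗 Transformers; the linked repo's own interface may differ, and the model, prompt, and stop string here are illustrative assumptions.

```python
# Minimal sketch: halt generation once the output ends with a stop sequence.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
)

class StopOnSequence(StoppingCriteria):
    """Return True (stop) when the generated ids end with the stop token ids."""
    def __init__(self, stop_ids):
        self.stop_ids = stop_ids

    def __call__(self, input_ids, scores, **kwargs):
        return input_ids[0, -len(self.stop_ids):].tolist() == self.stop_ids

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative model choice
model = AutoModelForCausalLM.from_pretrained("gpt2")

stop_ids = tokenizer.encode("\n\n")  # hypothetical stop sequence
inputs = tokenizer("Q: What is DeepSpeed?\nA:", return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=50,
    stopping_criteria=StoppingCriteriaList([StopOnSequence(stop_ids)]),
)
print(tokenizer.decode(output[0]))
```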