bigscience-workshop / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2
1,338 · Updated 8 months ago

Related projects

Alternatives and complementary repositories for Megatron-DeepSpeed