microsoft / Megatron-DeepSpeed

Ongoing research on training transformer language models at scale, including BERT and GPT-2
1,884 · Updated 3 weeks ago

Related projects

Alternatives and complementary repositories for Megatron-DeepSpeed