microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2
1,900 · Updated last month

Related projects

Alternatives and complementary repositories for Megatron-DeepSpeed