awslabs / Lancet-Accelerating-MoE-Training-via-Whole-Graph-Computation-Communication-Overlapping

Official implementation of the paper "Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping", published at MLSys 2024.
