awslabs / Lancet-Accelerating-MoE-Training-via-Whole-Graph-Computation-Communication-Overlapping

Official implementation for the paper Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping, published in MLSys'24.
12 · Updated 8 months ago

Alternatives and similar repositories for Lancet-Accelerating-MoE-Training-via-Whole-Graph-Computation-Communication-Overlapping

Users who are interested in Lancet-Accelerating-MoE-Training-via-Whole-Graph-Computation-Communication-Overlapping are comparing it to the libraries listed below.
