thu-ml / ReMoE

Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.
52 · Updated 3 weeks ago

Alternatives and similar repositories for ReMoE:

Users who are interested in ReMoE are comparing it to the libraries listed below.