thu-ml / ReMoE

Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.
57 · Updated 2 months ago

Alternatives and similar repositories for ReMoE:

Users who are interested in ReMoE are comparing it to the libraries listed below.