Hunter-DDM / stablemoe
Code for the ACL-2022 paper "StableMoE: Stable Routing Strategy for Mixture of Experts"
☆47 · Updated 2 years ago
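For context, the routing problem StableMoE studies is deciding which expert sub-network processes each token. Below is a minimal sketch of a generic top-1 softmax router; it is not the paper's StableMoE strategy, and all names and shapes are illustrative assumptions.

```python
# Minimal sketch of token-to-expert routing in a Mixture-of-Experts layer.
# NOT the paper's method: a generic top-1 softmax router with made-up shapes.
import numpy as np

def top1_route(tokens, gate_weights):
    """Assign each token to one expert via a learned gating matrix.

    tokens:       (num_tokens, d_model) token representations
    gate_weights: (d_model, num_experts) router parameters
    Returns per-token (expert_ids, gate_probs).
    """
    logits = tokens @ gate_weights                    # (num_tokens, num_experts)
    logits -= logits.max(axis=-1, keepdims=True)      # numerical stability
    exp = np.exp(logits)
    probs = exp / exp.sum(axis=-1, keepdims=True)     # softmax over experts
    expert_ids = probs.argmax(axis=-1)                # hard top-1 assignment
    return expert_ids, probs[np.arange(len(tokens)), expert_ids]

rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 8))                  # 4 tokens, d_model=8
gate_weights = rng.standard_normal((8, 2))            # 2 experts
ids, gate_probs = top1_route(tokens, gate_weights)
```

Because the gating matrix is trained jointly with the experts, these hard assignments can fluctuate during training; stabilizing them is the issue the paper targets.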
Alternatives and similar repositories for stablemoe
Users interested in stablemoe are comparing it to the repositories listed below.
- This package implements THOR: Transformer with Stochastic Experts. ☆65 · Updated 3 years ago
- A Kernel-Based View of Language Model Fine-Tuning (https://arxiv.org/abs/2210.05643) ☆76 · Updated last year