giangdip2410 / HyperRouter

Code for the paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"
31 · Updated 11 months ago

Related projects

Alternatives and complementary repositories for HyperRouter