lucidrains / soft-moe-pytorch
Implementation of Soft MoE, proposed by Google Brain's Vision team, in PyTorch
☆344 · Apr 2, 2025 · Updated 10 months ago
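For orientation, here is a minimal sketch of the Soft MoE routing scheme from "From Sparse to Soft Mixtures of Experts" that these repositories implement, written in plain PyTorch. It does not follow any particular repository's API; the class name `SoftMoE`, the `slots_per_expert` parameter, and the two-layer expert MLPs are illustrative assumptions.

```python
import torch
from torch import nn

class SoftMoE(nn.Module):
    """Minimal sketch of a Soft MoE layer (not any repo's actual API).

    Each expert processes a fixed number of slots. Tokens are softly
    dispatched to slots and slot outputs are softly combined back,
    so routing stays fully differentiable (no top-k gating).
    """
    def __init__(self, dim, num_experts=4, slots_per_expert=1, hidden_mult=4):
        super().__init__()
        # learned slot embeddings used to score token-slot affinity
        self.slot_embeds = nn.Parameter(torch.randn(num_experts, slots_per_expert, dim))
        # assumed expert architecture: simple two-layer MLPs
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(dim, dim * hidden_mult),
                nn.GELU(),
                nn.Linear(dim * hidden_mult, dim),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x):                                   # x: (batch, seq, dim)
        # token-slot affinity logits: (batch, seq, experts, slots)
        logits = torch.einsum('bnd,esd->bnes', x, self.slot_embeds)
        # dispatch weights: softmax over tokens -> each slot is a convex mix of tokens
        dispatch = logits.softmax(dim=1)
        # combine weights: softmax over all slots -> each token is a convex mix of slot outputs
        combine = logits.flatten(2).softmax(dim=-1).reshape(logits.shape)

        slot_inputs = torch.einsum('bnd,bnes->besd', x, dispatch)
        slot_outputs = torch.stack(
            [expert(slot_inputs[:, i]) for i, expert in enumerate(self.experts)],
            dim=1,
        )                                                   # (batch, experts, slots, dim)
        return torch.einsum('besd,bnes->bnd', slot_outputs, combine)

if __name__ == '__main__':
    moe = SoftMoE(dim=512, num_experts=8)
    tokens = torch.randn(2, 196, 512)                       # e.g. a batch of ViT patch tokens
    out = moe(tokens)                                       # (2, 196, 512)
```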
Alternatives and similar repositories for soft-moe-pytorch
Users interested in soft-moe-pytorch are comparing it to the libraries listed below.
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆82 · Oct 5, 2023 · Updated 2 years ago
- ☆705 · Dec 6, 2025 · Updated 2 months ago
- A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models ☆848 · Sep 13, 2023 · Updated 2 years ago
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆68 · Aug 22, 2023 · Updated 2 years ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆123 · Oct 17, 2024 · Updated last year
- A collection of AWESOME things about mixture-of-experts ☆1,262 · Dec 8, 2024 · Updated last year
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT