A Pytorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
☆860Sep 13, 2023Updated 2 years ago
Alternatives and similar repositories for mixture-of-experts
Users that are interested in mixture-of-experts are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538☆1,246Apr 19, 2024Updated 2 years ago
- Implementation of ST-Moe, the latest incarnation of MoE after years of research at Brain, in Pytorch☆385Jun 17, 2024Updated 2 years ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch☆347Apr 2, 2025Updated last year
- A collection of AWESOME things about mixture-of-experts☆1,280Dec 8, 2024Updated last year
- A fast MoE impl for PyTorch☆1,855Feb 10, 2025Updated last year
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- ☆724Jun 6, 2026Updated last week
- Tutel MoE: Optimized Mixture-of-Experts Library, Support GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4☆993Jun 4, 2026Updated last week
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts☆122Oct 17, 2024Updated last year
- A curated reading list of research in Mixture-of-Experts(MoE).☆667Oct 30, 2024Updated last year
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at Neurips 2023 (Main confer…