junfanz1 / MoE-Mixture-of-Experts-in-PyTorch
View external linksLinks

Implementations of a Mixture-of-Experts (MoE) architecture designed for research on large language models (LLMs) and scalable neural network designs. One implementation targets a **single-device/NPU environment** while the other is built for multi-device distributed computing. Both versions showcase the core principles.

☆55

Alternatives and similar repositories for MoE-Mixture-of-Experts-in-PyTorch

Users that are interested in MoE-Mixture-of-Experts-in-PyTorch are comparing it to the libraries listed below

Sorting:

The-Swarm-Corporation / Mamba-R1
View on GitHub
Mamba R1 represents a novel architecture that combines the efficiency of Mamba's state space models with the scalability of Mixture of Ex…
☆25Oct 13, 2025Updated 4 months ago
yanzhanglab / Graph2GO
View on GitHub
Graph-based representation learning method for protein function prediction
☆24Aug 25, 2025Updated 5 months ago
lumimim / NeuPRINT
View on GitHub
Official implementation of Neuronal Time-Invariant Representations (NeuPRINT), NeurIPS 2023
☆10Apr 17, 2024Updated last year
YihePang / IDP-LM
View on GitHub
Computational predictor of protein intrinsic disorder and its functions
☆10Dec 4, 2023Updated 2 years ago
jcottaar / seismic
View on GitHub
Jeroen Cottaar's work for the Kaggle Geophysical Waveform Inversion competition (2nd place)
☆11Aug 11, 2025Updated 6 months ago
yaozhspider / LLMs-Sec-Eval
View on GitHub
☆11Dec 5, 2024Updated last year
OPUS-MaLab / opus_rota4
View on GitHub
OPUS-Rota4: A Gradient-Based Protein Side-Chain Modeling Framework Assisted by Deep Learning-Based Predictors
☆11Apr 14, 2022Updated 3 years ago
antonio-f / mixture-of-experts-from-scratch
View on GitHub
Mixture of Experts from scratch
☆13Apr 12, 2024Updated last year
starjob42 / Starjob
View on GitHub
JSSP dataset for LLMs
☆17May 29, 2025Updated 8 months ago
guolinke / fused_ops
View on GitHub
☆10Aug 15, 2022Updated 3 years ago
Westlake-drug-discovery / AlphaDesign
View on GitHub
☆13May 25, 2022Updated 3 years ago
elliebirbeck / sklearn-tutorial
View on GitHub
An Image Recognition tutorial written for the HyperionDev blog
☆10Dec 19, 2017Updated 8 years ago
huggingface / hf-rocm-kernels
View on GitHub
☆23Jul 11, 2025Updated 7 months ago
erthorpabar / pytorch-DecoderOnly-model
View on GitHub
gpt from 0 -> 1
☆11Oct 9, 2025Updated 4 months ago
arontier / A_Prot_Paper
View on GitHub
A Prot paper related materials
☆11Sep 5, 2022Updated 3 years ago
ChenZiHong-Gavin / MoE-Visualizer
View on GitHub
MoE-Visualizer is a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models.
☆16Apr 8, 2025Updated 10 months ago
lsbuschoff / multimodal
View on GitHub
☆14Mar 15, 2025Updated 11 months ago
DrHB / rna-stanford
View on GitHub
Transformer + GAT for RNA chemical reactivity prediction| Stanford Ribonanza
☆11Jan 28, 2026Updated 2 weeks ago
Fleishman-Lab / AbDesign_for_enzymes
View on GitHub
Scripts and data to run AbDesign as described in Tools for protein science 2021
☆14Nov 4, 2020Updated 5 years ago
yiksiu-chan / SpeakEasy
View on GitHub
[ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
☆14Sep 27, 2025Updated 4 months ago
simveit / persistent_dense_gemm
View on GitHub
Persistent dense gemm for Hopper in `CuTeDSL`
☆15Aug 9, 2025Updated 6 months ago
thuxugang / opus_rota4
View on GitHub
OPUS-Rota4: A Gradient-Based Protein Side-Chain Modeling Framework Assisted by Deep Learning-Based Predictors
☆10Apr 14, 2022Updated 3 years ago
ZeweiChu / NatCat
View on GitHub
natural annotated text-category pairs for text classification
☆10Sep 10, 2021Updated 4 years ago
ratthachat / deep-rna
View on GitHub
Easy & Pretrained SOTA Deep Learning for RNA strings
☆12Apr 15, 2022Updated 3 years ago
hassonlab / b2b-linguistic-coupling
View on GitHub
☆12Jun 22, 2024Updated last year
WorldEditors / EvolvingPlasticANN
View on GitHub
Codes for Evolving Plastic ANNs
☆14Dec 18, 2022Updated 3 years ago
DrHB / 2nd-place-contrails
View on GitHub
Multi-encoder segmentation for contrail detection in satellite imagery | Google Researc
☆11Jan 28, 2026Updated 2 weeks ago
ocular-motor-lab / OpenIris
View on GitHub
☆14Aug 18, 2025Updated 5 months ago
SDML-KU / qkvflow
View on GitHub
Neural ODE Transformers (ICLR 2025)
☆17Sep 6, 2025Updated 5 months ago
Justherozen / FlowBench
View on GitHub
[EMNLP 2024] FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents
☆20Jan 6, 2025Updated last year
dpmlab / img2fmri
View on GitHub
A python package for predicting group-level fMRI responses to visual stimuli using deep neural networks
☆13Mar 31, 2025Updated 10 months ago
ademeure / QuickRunCUDA
View on GitHub
☆15Oct 30, 2025Updated 3 months ago
mb-BCA / NetDynFlow
View on GitHub
A package to study complex networks based on the temporal evolution of their Dynamic Communicability and Flow.
☆11Jan 30, 2026Updated 2 weeks ago
GraphMoLab / Graph2Token
View on GitHub
☆12Jul 2, 2025Updated 7 months ago
BiEchi / DistributedTrainingGPT2
View on GitHub
基于PyTorch GPT-2的针对各种数据并行pretrain的研究代码.
☆11Dec 16, 2022Updated 3 years ago
ethanmock / UniPMT
View on GitHub
The code for UniPMT
☆18Mar 16, 2025Updated 11 months ago
luongthecong123 / fp8-quant-matmul
View on GitHub
Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.
☆17Feb 9, 2026Updated last week
DAMO-NLP-SG / MT-LLaMA
View on GitHub
Multi-Task instruction-tuned LLaMA
☆14May 5, 2023Updated 2 years ago
NeurAI-Lab / SCoMMER
View on GitHub
The official PyTorch code for AAAI'23 Paper "Sparse Coding in a Dual Memory System for Lifelong Learning"
☆12Feb 15, 2023Updated 3 years ago

junfanz1 / MoE-Mixture-of-Experts-in-PyTorchView external linksLinks

Alternatives and similar repositories for MoE-Mixture-of-Experts-in-PyTorch

junfanz1 / MoE-Mixture-of-Experts-in-PyTorch
View external linksLinks