HPDL-Group / Merak

☆81

Alternatives and similar repositories for Merak

Users who are interested in Merak are comparing it to the libraries listed below.

zhuohan123 / terapipe
☆75 Updated 4 years ago
ConnollyLeon / awesome-Auto-Parallelism
A baseline repository of Auto-Parallelism in Training Neural Networks
☆147 Updated 3 years ago
microsoft / nnscaler
nnScaler: Compiling DNN models for Parallel Training
☆118 Updated last month
ParCIS / Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
☆67 Updated 7 months ago
parasailteam / coconet
☆83 Updated 2 years ago
AlibabaPAI / DAPPLE
An Efficient Pipelined Data Parallel Approach for Training Large Models
☆76 Updated 4 years ago
microsoft / msccl-tools
Synthesizer for optimal collective communication algorithms
☆118 Updated last year
microsoft / SparTA
☆153 Updated last year
calculon-ai / calculon
☆154 Updated last year
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆181 Updated 2 weeks ago
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆122 Updated 3 years ago
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆88 Updated 2 years ago
mutinifni / splitwise-sim
LLM serving cluster simulator
☆116 Updated last year
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆43 Updated 2 years ago
AlibabaResearch / flash-llm
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
☆221 Updated 2 years ago
yifuwang / symm-mem-recipes
☆141 Updated 9 months ago
awslabs / raf
☆145 Updated 8 months ago
thu-pacman / FasterMoE
☆87 Updated 3 years ago
UofT-EcoSystem / DietCode
DietCode Code Release
☆65 Updated 3 years ago
HPMLL / BurstGPT
A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
☆213 Updated 3 months ago
microsoft / msccl
Microsoft Collective Communication Library
☆367 Updated 2 years ago
microsoft / SuperScaler
An experimental parallel training platform
☆54 Updated last year
tlc-pack / tenset
☆93 Updated 2 years ago
DachengLi1 / AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
☆41 Updated 2 years ago
osayamenja / FlashMoE
Distributed MoE in a Single Kernel [NeurIPS '25]
☆89 Updated 3 weeks ago
sail-sg / zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
☆432 Updated 5 months ago
facebookexperimental / triton
GitHub mirror of the triton-lang/triton repo.
☆86 Updated this week
UofT-EcoSystem / hfta
Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion
☆32 Updated last year
fanshiqing / grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
☆156 Updated 2 weeks ago
alibaba / llm-scheduling-artifact
Artifact of OSDI '24 paper, “Llumnix: Dynamic Scheduling for Large Language Model Serving”
☆62 Updated last year