ParCIS / Chimera

Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.

☆68

Alternatives and similar repositories for Chimera

Users who are interested in Chimera are comparing it to the libraries listed below.

zhuohan123 / terapipe
☆77 Updated 4 years ago
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆190 Updated last month
microsoft / nnscaler
nnScaler: Compiling DNN models for Parallel Training
☆120 Updated 2 months ago
microsoft / SparTA
☆159 Updated last year
parasailteam / coconet
☆83 Updated 3 years ago
alibaba / easydist
Automated Parallelization System and Infrastructure for Multiple Ecosystems
☆80 Updated last year
AlibabaResearch / flash-llm
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
☆224 Updated 2 years ago
hao-ai-lab / MuxServe
☆79 Updated last month
awslabs / optimizing-multitask-training-through-dynamic-pipelines
Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines
☆20 Updated last year
facebookexperimental / triton
GitHub mirror of the triton-lang/triton repo.
☆100 Updated this week
ConnollyLeon / awesome-Auto-Parallelism
A baseline repository of Auto-Parallelism in Training Neural Networks
☆147 Updated 3 years ago
HPDL-Group / Merak
☆81 Updated 6 months ago
LoongServe / LoongServe
☆124 Updated last year
yifuwang / symm-mem-recipes
☆148 Updated 11 months ago
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆91 Updated 2 years ago
LLMServe / SwiftTransformer
High performance Transformer implementation in C++.
☆142 Updated 10 months ago
Relaxed-System-Lab / HexGen
[ICML 2024] Serving LLMs on heterogeneous decentralized clusters.
☆31 Updated last year
KuangjuX / NVSHMEM-Tutorial
NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer
☆144 Updated 2 months ago
AlibabaPAI / FLASHNN
☆102 Updated last year
SymbioticLab / Oobleck
A resilient distributed training framework
☆96 Updated last year
thu-pacman / FasterMoE
☆88 Updated 3 years ago
DachengLi1 / AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
☆43 Updated 3 years ago
Raphael-Hao / brainstorm
Compiler for Dynamic Neural Networks
☆46 Updated 2 years ago
tgale96 / grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
☆131 Updated 6 months ago
flexflow / flexflow-serve
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
☆63 Updated 2 months ago
mcrl / tccl
Thunder Research Group's Collective Communication Library
☆43 Updated 4 months ago
ranggihwang / Pregated_MoE
☆57 Updated last year
thunlp / Seq1F1B
Sequence-level 1F1B schedule for LLMs.
☆37 Updated 3 months ago
stepfun-ai / StepMesh
☆324 Updated 3 weeks ago
Azure / msccl
Microsoft Collective Communication Library
☆66 Updated last year