alpa-projects / tensorflow-alpa

☆20

Alternatives and similar repositories for tensorflow-alpa

Users who are interested in tensorflow-alpa are comparing it to the libraries listed below.

parasailteam / coconet
☆79 Updated 2 years ago
microsoft / SuperScaler
An experimental parallel training platform
☆54 Updated last year
AlibabaPAI / DAPPLE
An Efficient Pipelined Data Parallel Approach for Training Large Models
☆76 Updated 4 years ago
ConnollyLeon / awesome-Auto-Parallelism
A baseline repository of Auto-Parallelism in Training Neural Networks
☆144 Updated 3 years ago
openxla / shardy
MLIR-based partitioning system
☆97 Updated this week
microsoft / msccl-tools
Synthesizer for optimal collective communication algorithms
☆108 Updated last year
yifuwang / symm-mem-recipes
☆90 Updated 6 months ago
ColfaxResearch / cfx-article-src
☆117 Updated last month
Azure / msccl
Microsoft Collective Communication Library
☆64 Updated 7 months ago
ParCIS / Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
☆67 Updated 3 months ago
microsoft / nnscaler
nnScaler: Compiling DNN models for Parallel Training
☆113 Updated last week
zhuohan123 / terapipe
☆74 Updated 4 years ago
microsoft / NPKit
NCCL Profiling Kit
☆138 Updated 11 months ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆41 Updated 2 years ago
eniac / paella
Paella: Low-latency Model Serving with Virtualized GPU Scheduling
☆59 Updated last year
UofT-EcoSystem / hfta
Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion
☆32 Updated last year
iree-org / iree-experimental
Experiments and prototypes associated with IREE or MLIR
☆50 Updated 10 months ago
msr-fiddle / piper
☆9 Updated 3 years ago
microsoft / msccl
Microsoft Collective Communication Library
☆350 Updated last year
apache / tvm-rfcs
A home for the final text of all TVM RFCs.
☆105 Updated 9 months ago
tlc-pack / tenset
☆92 Updated 2 years ago
awslabs / raf
☆144 Updated 4 months ago
nox-410 / tvm.tl
An extension of TVMScript to write simple and high-performance GPU kernels with tensorcore.
☆50 Updated 11 months ago
google / iopddl
Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning
☆23 Updated last month
uccl-project / uccl
Ultra and Unified CCL
☆165 Updated this week
SJTU-IPADS / reef
REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche…
☆94 Updated 2 years ago
sunlex0717 / DissectingTensorCores
☆98 Updated last year
UofT-EcoSystem / hotline
☆32 Updated 2 years ago
microsoft / triton-shared
Shared Middle-Layer for Triton Compilation
☆256 Updated this week
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆121 Updated 3 years ago