Hsword / HetuLinks

A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, please visit/star/fork https://github.com/PKU-DAIR/Hetu

☆122

Alternatives and similar repositories for Hetu

Users that are interested in Hetu are comparing it to the libraries listed below

Sorting:

PKU-DAIR / Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training.
☆324Updated 2 months ago
thu-pacman / SmartMoE-AE
ATC23 AE
☆47Updated 2 years ago
microsoft / nnscaler
nnScaler: Compiling DNN models for Parallel Training
☆117Updated last month
zhuohan123 / terapipe
☆75Updated 4 years ago
thu-pacman / FasterMoE
☆87Updated 3 years ago
DachengLi1 / AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
☆41Updated 2 years ago
SJTU-IPADS / disb
DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.
☆54Updated last year
kwai / Megatron-Kwai
[USENIX ATC '24] Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Paral…
☆66Updated last year
ConnollyLeon / awesome-Auto-Parallelism
A baseline repository of Auto-Parallelism in Training Neural Networks
☆147Updated 3 years ago
HPDL-Group / Merak
☆81Updated 5 months ago
ParCIS / Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
☆67Updated 7 months ago
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆88Updated 2 years ago
alibaba / easydist
Automated Parallelization System and Infrastructure for Multiple Ecosystems
☆80Updated 11 months ago
Relaxed-System-Lab / HexGen
[ICML 2024] Serving LLMs on heterogeneous decentralized clusters.
☆30Updated last year
alibaba / llm-scheduling-artifact
Artifact of OSDI '24 paper, ”Llumnix: Dynamic Scheduling for Large Language Model Serving“
☆62Updated last year
AlibabaResearch / flash-llm
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
☆221Updated 2 years ago
microsoft / ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
☆188Updated last year
SymbioticLab / Oobleck
A resilient distributed training framework
☆96Updated last year
WukLab / preble
Stateful LLM Serving
☆87Updated 7 months ago
AlibabaPAI / FLASHNN
☆100Updated last year
InternLM / Awesome-LLM-Training-System
☆43Updated last year
lambda7xx / awesome-AI-system
paper and its code for AI System
☆331Updated 2 months ago
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆181Updated 2 weeks ago
zhengzangw / Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
☆92Updated 2 years ago
microsoft / SuperScaler
An experimental parallel training platform
☆54Updated last year
AlibabaPAI / DAPPLE
An Efficient Pipelined Data Parallel Approach for Training Large Model
☆76Updated 4 years ago
hao-ai-lab / MuxServe
☆74Updated last week
CalvinXKY / mfu_calculation
A simple calculation for LLM MFU.
☆48Updated last month
parasailteam / coconet
☆83Updated 2 years ago
Youhe-Jiang / IJCAI2023-OptimalShardedDataParallel
[IJCAI2023] An automated parallel training system that combines the advantages from both data and model parallelism. If you have any inte…
☆52Updated 2 years ago