DataStates / datastates-llmLinks

LLM checkpointing for DeepSpeed/Megatron

☆17

Alternatives and similar repositories for datastates-llm

Users that are interested in datastates-llm are comparing it to the libraries listed below

Sorting:

SymbioticLab / Oobleck
A resilient distributed training framework
☆95Updated last year
kungfu-team / tenplex
Dynamic resources changes for multi-dimensional parallelism training
☆25Updated 7 months ago
ByteDance-Seed / StragglerAnalysis
☆31Updated last month
casys-kaist / EnvPipe
☆25Updated last year
WukLab / preble
Stateful LLM Serving
☆73Updated 3 months ago
thu-pacman / FasterMoE
☆84Updated 3 years ago
msr-fiddle / CheckFreq
☆53Updated 4 years ago
hao-ai-lab / MuxServe
☆62Updated last year
osayamenja / Kleos
Complete GPU residency for ML.
☆17Updated last week
NEO-MLSys25 / NEO
NEO is a LLM inference engine built to save the GPU memory crisis by CPU offloading
☆39Updated last week
eth-easl / orion
An interference-aware scheduler for fine-grained GPU sharing
☆140Updated 5 months ago
msr-fiddle / harmony
☆16Updated 2 years ago
alibaba / llm-scheduling-artifact
Artifact of OSDI '24 paper, ”Llumnix: Dynamic Scheduling for Large Language Model Serving“
☆61Updated last year
cirquit / presto
☆15Updated 2 years ago
LoongServe / LoongServe
☆104Updated 7 months ago
ParCIS / Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
☆67Updated 3 months ago
Relaxed-System-Lab / HexGen
[ICML 2024] Serving LLMs on heterogeneous decentralized clusters.
☆26Updated last year
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆143Updated last week
jasperzhong / swift
☆14Updated 3 years ago
SJTU-IPADS / disb
DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.
☆52Updated 10 months ago
Ying1123 / VTC-artifact
☆32Updated last year
zhuohan123 / terapipe
☆74Updated 4 years ago
awslabs / optimizing-multitask-training-through-dynamic-pipelines
Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines
☆19Updated last year
LighT-chenml / LightCheck
☆9Updated last year
msr-fiddle / blox
☆44Updated 11 months ago
microsoft / RetrievalAttention
Scalable long-context LLM decoding that leverages sparsity—by treating the KV cache as a vector storage system.
☆54Updated last week
dywsjtu / apparate
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆25Updated 7 months ago
msr-fiddle / CoorDL
☆24Updated 2 years ago
zhengzangw / Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
☆88Updated 2 years ago
DachengLi1 / AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
☆39Updated 2 years ago