SamsungLabs / Metis

[ATC '24] Metis: Fast automatic distributed training on heterogeneous GPUs (https://www.usenix.org/conference/atc24/presentation/um)

☆26

Alternatives and similar repositories for Metis

Users who are interested in Metis compare it to the libraries listed below.


Relaxed-System-Lab / HexGen
[ICML 2024] Serving LLMs on heterogeneous decentralized clusters.
☆25 Updated last year
HPMLL / BurstGPT
A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
☆171 Updated 7 months ago
eth-easl / orion
An interference-aware scheduler for fine-grained GPU sharing
☆137 Updated 4 months ago
mutinifni / splitwise-sim
LLM serving cluster simulator
☆100 Updated last year
hao-ai-lab / MuxServe
☆62 Updated 11 months ago
LoongServe / LoongServe
☆99 Updated 6 months ago
SymbioticLab / Oobleck
A resilient distributed training framework
☆95 Updated last year
JF-D / Parcae
☆18 Updated last year
JF-D / Proteus
☆23 Updated 10 months ago
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆81 Updated last year
UMass-LIDS / Proteus
Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling
☆12 Updated last year
DicardoX / Research-Space
This repository is established to store personal notes and annotated papers during daily research.
☆125 Updated this week
Hsword / Awesome-Machine-Learning-System-Papers
☆73 Updated 3 years ago
zhengzangw / Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
☆86 Updated 2 years ago
snu-comparch / InfiniGen
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
☆134 Updated 10 months ago
Thesys-lab / Helix-ASPLOS25
Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"
☆46 Updated 6 months ago
microsoft / nnscaler
nnScaler: Compiling DNN models for Parallel Training
☆113 Updated last month
abhibambhaniya / GenZ-LLM-Analyzer
LLM Inference analyzer for different hardware platforms
☆69 Updated this week
mental2008 / awesome-papers
Here are my personal paper reading notes (including cloud computing, resource management, systems, machine learning, deep learning, and o…
☆103 Updated last week
Raphael-Hao / Abacus
☆37 Updated 3 years ago
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆132 Updated 3 weeks ago
microsoft / SuperScaler
An experimental parallel training platform
☆54 Updated last year
WukLab / preble
Stateful LLM Serving
☆70 Updated 2 months ago
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆121 Updated last year
pkusys / ElasticFlow
Artifacts for our ASPLOS'23 paper ElasticFlow
☆51 Updated last year
kungfu-team / tenplex
Dynamic resource changes for multi-dimensional parallelism training
☆25 Updated 6 months ago
LLMServe / SwiftTransformer
High performance Transformer implementation in C++.
☆124 Updated 4 months ago
ruipeterpan / marconi
Artifact for "Marconi: Prefix Caching for the Era of Hybrid LLMs" [MLSys '25 Outstanding Paper Honorable Mention]
☆10 Updated 2 months ago
msr-fiddle / synergy
☆49 Updated 2 years ago
eniac / paella
Paella: Low-latency Model Serving with Virtualized GPU Scheduling
☆58 Updated last year