Mutinifni / splitwise-sim

LLM serving cluster simulator

☆135

Alternatives and similar repositories for splitwise-sim

Users who are interested in splitwise-sim are comparing it to the repositories listed below.

HPMLL / BurstGPT
A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
☆238 Updated this week
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆93 Updated 2 years ago
DicardoX / Research-Space
This repository is established to store personal notes and annotated papers during daily research.
☆180 Updated 3 weeks ago
Thesys-lab / Helix-ASPLOS25
Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"
☆77 Updated 3 months ago
sitar-lab / NeuSight
☆65 Updated 7 months ago
Raphael-Hao / brainstorm
Compiler for Dynamic Neural Networks
☆45 Updated 2 years ago
casys-kaist / LLMServingSim
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
☆178 Updated 6 months ago
LoongServe / LoongServe
☆131 Updated last year
JF-D / Proteus
☆24 Updated last year
alibaba / llm-scheduling-artifact
Artifact of OSDI '24 paper, "Llumnix: Dynamic Scheduling for Large Language Model Serving"
☆64 Updated last year
parasailteam / coconet
☆84 Updated 3 years ago
abhibambhaniya / GenZ-LLM-Analyzer
LLM Inference analyzer for different hardware platforms
☆100 Updated 2 months ago
SJTU-IPADS / reef
REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche…
☆104 Updated 3 years ago
pkusys / ElasticFlow
Artifacts for our ASPLOS'23 paper ElasticFlow
☆55 Updated last year
UMass-LIDS / Proteus
Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling
☆12 Updated last year
shenh10 / DeepSeek_Simulator
☆93 Updated 10 months ago
eth-easl / orion
An interference-aware scheduler for fine-grained GPU sharing
☆159 Updated 2 months ago
mental2008 / awesome-papers
Here are my personal paper reading notes (including machine learning systems, AI infrastructure, and other interesting stuff).
☆155 Updated last week
calculon-ai / calculon
☆166 Updated last year
snu-comparch / InfiniGen
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
☆174 Updated last year
ConnollyLeon / awesome-Auto-Parallelism
A baseline repository of Auto-Parallelism in Training Neural Networks
☆147 Updated 3 years ago
Hsword / Awesome-Machine-Learning-System-Papers
☆79 Updated 3 years ago
Raphael-Hao / Abacus
☆38 Updated 7 months ago
microsoft / taccl
TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
☆80 Updated 2 years ago
chenhongyu2048 / LLM-inference-optimization-paper
Summary of some awesome work for optimizing LLM inference
☆172 Updated 2 months ago
SJTU-ReArch-Group / Paper-Reading-List
☆145 Updated last month
alibaba-edu / qwen-bailian-usagetraces-anon
☆78 Updated 2 weeks ago
microsoft / msccl-tools
Synthesizer for optimal collective communication algorithms
☆124 Updated last year
NetX-lab / Ayo
[ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo
☆62 Updated 6 months ago
hao-ai-lab / MuxServe
☆84 Updated 3 months ago