splitwise-sim: LLM serving cluster simulator (☆135, updated Apr 25, 2024)
Alternatives and similar repositories for splitwise-sim
Users who are interested in splitwise-sim are comparing it to the repositories listed below.
- A large-scale simulation framework for LLM inference (☆539, updated Jul 25, 2025)
- A low-latency & high-throughput serving engine for LLMs (☆480, updated Jan 8, 2026)
- (☆131, updated Nov 11, 2024)
- A ChatGPT (GPT-3.5) & GPT-4 workload trace to optimize LLM serving systems (☆241, updated Feb 1, 2026)
- Disaggregated serving system for Large Language Models (LLMs) (☆777, updated Apr 6, 2025)
- Microsoft Azure Traces (☆1,076, updated Dec 6, 2025)
- (☆224, updated Oct 24, 2025)
- Papers and their code for AI systems (☆348, updated Feb 10, 2026)
- Dynamic memory management for serving LLMs without PagedAttention (☆463, updated May 30, 2025)
- TiledLower, a dataflow analysis and codegen framework written in Rust (☆14, updated Nov 23, 2024)
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24] (☆24, updated Nov 21, 2024)
- Since the emergence of ChatGPT in 2022, accelerating large language models has become increasingly important. Here is a list of pap… (☆282, updated Mar 6, 2025)
- Repository for the MLCommons Chakra schema and tools (☆39, updated Dec 24, 2023)
- Stateful LLM serving (☆96, updated Mar 11, 2025)
- LLMServingSim 2.0: a unified simulator for heterogeneous and disaggregated LLM serving infrastructure (☆177, updated this week)
- How to plot for papers, slides, demos, etc. (☆10, updated Apr 7, 2022)
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo (☆17, updated Mar 13, 2023)
- (☆20, updated Sep 28, 2024)
- SpotServe: serving generative large language models on preemptible instances (☆135, updated Feb 22, 2024)
- Artifact for OSDI '23 "MGG: Accelerating Graph Neural Networks with Fine-grained Intra-kernel Communication-Computation Pipelining on Mult…" (☆41, updated Mar 17, 2024)
- (☆631, updated Jan 14, 2026)
- MSCCL++: a GPU-driven communication stack for scalable AI applications (☆469, updated Feb 21, 2026)
- Nsight Compute in Docker (☆13, updated Dec 21, 2023)
- A high-throughput and memory-efficient inference and serving engine for LLMs (☆13, updated Feb 11, 2026)
- (☆18, updated Mar 4, 2025)
- A throughput-oriented, high-performance serving framework for LLMs (☆946, updated Oct 29, 2025)
- (☆26, updated Aug 31, 2023)
- GPU-accelerated LLM training simulator (☆17, updated Jun 26, 2025)
- Efficient and easy multi-instance LLM serving (☆527, updated Sep 3, 2025)
- (☆813, updated Dec 31, 2025)
- LLM serving performance evaluation harness (☆83, updated Feb 25, 2025)
- [ICLR 2025] TidalDecode: fast and accurate LLM decoding with position-persistent sparse attention (☆52, updated Aug 6, 2025)
- HW/SW co-designed end-host RPC stack (☆20, updated Oct 28, 2021)
- A data collection of related work for "Toward Understanding Deep Learning Framework Bugs" (☆16, updated Oct 23, 2023)
- (☆234, updated Dec 27, 2025)
- (☆25, updated Feb 20, 2024)
- (☆150, updated Oct 9, 2024)
- Serverless LLM serving for everyone (☆656, updated Feb 20, 2026)
- An automated performance-optimization framework for P4-programmable SmartNICs (☆28, updated Nov 18, 2023)