HPMLL/BurstGPT

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/HPMLL/BurstGPT)

HPMLL / BurstGPT

A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems

☆279

Alternatives and similar repositories for BurstGPT

Users that are interested in BurstGPT are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

LLMServe / DistServe
View on GitHub
Disaggregated serving system for Large Language Models (LLMs).
☆826Apr 6, 2025Updated last year
Azure / AzurePublicDataset
View on GitHub
Microsoft Azure Traces
☆1,158Jun 3, 2026Updated last month
microsoft / sarathi-serve
View on GitHub
A low-latency & high-throughput serving engine for LLMs
☆512Jan 8, 2026Updated 6 months ago
microsoft / vidur
View on GitHub
Accurate, large-scale, and extensible simulator for LLM inference Systems
☆645Jul 25, 2025Updated 11 months ago
alibaba / ServeGen
View on GitHub
A framework for generating realistic LLM serving workloads
☆163May 11, 2026Updated 2 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Hsword / SpotServe
View on GitHub
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆135Feb 22, 2024Updated 2 years ago
mutinifni / splitwise-sim
View on GitHub
LLM serving cluster simulator
☆157Apr 25, 2024Updated 2 years ago
alibaba-edu / qwen-bailian-usagetraces-anon
View on GitHub
☆150Apr 23, 2026Updated 2 months ago
galeselee / Awesome_LLM_System-PaperList
View on GitHub
Since the emergence of chatGPT in 2022, the acceleration of Large Language Model has become increasingly important. Here is a list of pap…
☆284Mar 6, 2025Updated last year
LoongServe / LoongServe
View on GitHub
☆135Nov 11, 2024Updated last year
James-QiuHaoran / LLM-serving-with-proxy-models
View on GitHub
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an …
☆52Jun 1, 2024Updated 2 years ago
interestingLSY / swiftLLM
View on GitHub
A tiny yet powerful LLM inference system tailored for researching purpose. vLLM-equivalent performance with only 2k lines of code (2% of …
☆329Jun 10, 2025Updated last year
EfficientLLMSys / MuxServe
View on GitHub
☆15Jun 26, 2024Updated 2 years ago
llumnix-project / llumnix-ray
View on GitHub
Efficient and easy multi-instance LLM serving
☆563Mar 12, 2026Updated 4 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
microsoft / ParrotServe
View on GitHub
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
☆222Sep 21, 2024Updated last year
snu-comparch / InfiniGen
View on GitHub
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
☆192Jul 10, 2024Updated 2 years ago
AmberLJC / LLMSys-PaperList
View on GitHub
Large Language Model (LLM) Systems Paper List
☆2,197Jul 15, 2026Updated last week
S-Lab-System-Group / Awesome-DL-Scheduling-Papers
View on GitHub
☆332Jan 22, 2024Updated 2 years ago
ServerlessLLM / ServerlessLLM
View on GitHub
Serverless LLM Serving for Everyone.
☆693May 4, 2026Updated 2 months ago
InternLM / AcmeTrace
View on GitHub
☆179Mar 12, 2024Updated 2 years ago
alpa-projects / mms
View on GitHub
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆94Jul 14, 2023Updated 3 years ago
casys-kaist / LLMServingSim
View on GitHub
LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure
☆344Jul 15, 2026Updated last week
blitz-serving / trace-replayer
View on GitHub
Repo to replay Qwen trace
☆31Jan 9, 2026Updated 6 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
Hanchenli / vllm-continuum
View on GitHub
Preview Code for Continuum Paper
☆89Updated this week
microsoft / mscclpp
View on GitHub
MSCCL++: A GPU-driven communication stack for scalable AI applications
☆542Updated this week
IBM / LLM-performance-prediction
View on GitHub
Predict the performance of LLM inference services
☆23Sep 18, 2025Updated 10 months ago
lambda7xx / awesome-AI-system
View on GitHub
paper and its code for AI System
☆374May 14, 2026Updated 2 months ago
kvcache-ai / Mooncake
View on GitHub
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆5,941Updated this week
dywsjtu / apparate
View on GitHub
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆24Nov 21, 2024Updated last year
byungsoo-oh / ml-systems-papers
View on GitHub
Curated collection of papers in machine learning systems
☆632Feb 7, 2026Updated 5 months ago
thustorage / Medusa
View on GitHub
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆47May 13, 2025Updated last year
hao-ai-lab / vllm-ltr
View on GitHub
[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
☆81Nov 4, 2024Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
eth-easl / orion
View on GitHub
An interference-aware scheduler for fine-grained GPU sharing
☆163Nov 26, 2025Updated 7 months ago
microsoft / vattention
View on GitHub
Dynamic Memory Management for Serving LLMs without PagedAttention
☆504Updated this week
ovg-project / kvcached
View on GitHub
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
☆1,107Updated this week
efeslab / Nanoflow
View on GitHub
A throughput-oriented high-performance serving framework for LLMs
☆968Mar 29, 2026Updated 3 months ago
alibaba / llm-scheduling-artifact
View on GitHub
Artifact of OSDI '24 paper, ”Llumnix: Dynamic Scheduling for Large Language Model Serving“
☆64Jun 5, 2024Updated 2 years ago
S-Lab-System-Group / ChronusArtifact
View on GitHub
☆23Jan 7, 2022Updated 4 years ago
hao-ai-lab / MuxServe
View on GitHub
☆90Oct 17, 2025Updated 9 months ago