FCSLab / torpor

☆17

Alternatives and similar repositories for torpor

Users who are interested in torpor are comparing it to the repositories listed below.

NetX-lab / Echo-slowdown
Slowdown prediction module of Echo: Simulating Distributed Training at Scale
☆13 Updated 8 months ago
mental2008 / awesome-papers
Personal paper reading notes (covering machine learning systems, AI infrastructure, and other interesting topics).
☆155 Updated 2 weeks ago
pkusys / Auncel
Vector search with bounded performance.
☆35 Updated 2 years ago
wassemgtk / MegaScale-Infer-Prototyp
Prototype of MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism
☆26 Updated 10 months ago
eth-easl / orion
An interference-aware scheduler for fine-grained GPU sharing
☆159 Updated 2 months ago
SJTU-IPADS / PhoenixOS
Fast OS-level support for GPU checkpoint and restore
☆271 Updated 4 months ago
quiver-team / quiver-feature
High-performance RDMA-based distributed feature collection component for training GNN models on EXTREMELY large graphs
☆56 Updated 3 years ago
NetX-lab / Echo
Simulating Distributed Training at Scale
☆14 Updated 4 months ago
hao-ai-lab / MuxServe
☆85 Updated 3 months ago
WukLab / preble
Stateful LLM Serving
☆95 Updated 11 months ago
alibaba / ServeGen
A framework for generating realistic LLM serving workloads
☆100 Updated 4 months ago
microsoft / tokenweave
Accepted to MLSys 2026
☆70 Updated 2 weeks ago
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆135 Updated last year
denght23 / CAVER
NS3 simulator for RDMA load balancing
☆11 Updated last year
pkusys / TGS
Artifacts for our NSDI'23 paper TGS
☆96 Updated last year
UChi-JCL / CacheGen
☆150 Updated last year
S-Lab-System-Group / Awesome-DL-Scheduling-Papers
☆323 Updated 2 years ago
SJTU-IPADS / reef
REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche…
☆104 Updated 3 years ago
netx-repo / PipeSwitch
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
☆127 Updated 3 years ago
aliyun / aicb
☆231 Updated last month
eth-easl / pccheck
☆12 Updated last year
msr-fiddle / CheckFreq
☆56 Updated 5 years ago
Sys-KU / DeepPlan
[ACM EuroSys 2023] Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access
☆56 Updated 6 months ago
NEO-MLSys25 / NEO
NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading
☆84 Updated 7 months ago
alibaba / llm-scheduling-artifact
Artifact of the OSDI '24 paper “Llumnix: Dynamic Scheduling for Large Language Model Serving”
☆64 Updated last year
MachineLearningSystem / 25ASPLOS-Medusa
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆12 Updated last year
casys-kaist / glet
☆53 Updated last year
rkhan055 / SHADE
SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
☆36 Updated 2 years ago
DicardoX / Research-Space
This repository stores personal notes and annotated papers from daily research.
☆180 Updated 3 weeks ago
eniac / paella
Paella: Low-latency Model Serving with Virtualized GPU Scheduling
☆68 Updated last year