SJTU-IPADS / PhoenixOS

Fast OS-level support for GPU checkpoint and restore

☆245

Alternatives and similar repositories for PhoenixOS

Users who are interested in PhoenixOS compare it to the repositories listed below.

open-neutrino / neutrino
☆194 Updated 2 months ago
uccl-project / uccl
Ultra and Unified CCL
☆595 Updated this week
mental2008 / awesome-papers
Here are my personal paper reading notes (including cloud computing, resource management, systems, machine learning, deep learning, and o…
☆126 Updated last week
eth-easl / orion
An interference-aware scheduler for fine-grained GPU sharing
☆147 Updated 8 months ago
bytedance / InfiniStore
KV cache store for distributed LLM inference
☆345 Updated last month
SJTU-IPADS / reef
REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche…
☆101 Updated 2 years ago
XpuOS / xsched
A scheduling framework for multitasking over diverse XPUs, including GPUs, NPUs, ASICs, and FPGAs
☆120 Updated last month
WukLab / preble
Stateful LLM Serving
☆86 Updated 7 months ago
LLMServe / SwiftTransformer
High-performance Transformer implementation in C++.
☆135 Updated 9 months ago
pkusys / TGS
Artifacts for our NSDI'23 paper TGS
☆89 Updated last year
alibaba-edu / qwen-bailian-usagetraces-anon
☆55 Updated 4 months ago
zartbot / shallowsim
DeepSeek-V3/R1 inference performance simulator
☆170 Updated 6 months ago
shenh10 / DeepSeek_Simulator
☆90 Updated 6 months ago
microsoft / NPKit
NCCL Profiling Kit
☆145 Updated last year
NEO-MLSys25 / NEO
NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading
☆67 Updated 4 months ago
infinigence / Semi-PD
A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation.
☆112 Updated 5 months ago
AlibabaPAI / llumnix
Efficient and easy multi-instance LLM serving
☆497 Updated last month
stepfun-ai / StepMesh
☆307 Updated 3 weeks ago
alibaba / llm-scheduling-artifact
Artifact of the OSDI '24 paper, “Llumnix: Dynamic Scheduling for Large Language Model Serving”
☆62 Updated last year
Hsword / Awesome-Machine-Learning-System-Papers
☆77 Updated 3 years ago
SJTU-IPADS / PhoenixOS-Remoting
☆21 Updated 3 months ago
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆181 Updated last week
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆129 Updated last year
ovg-project / kvcached
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
☆104 Updated this week
mcrl / tccl
Thunder Research Group's Collective Communication Library
☆42 Updated 3 months ago
Azure / msccl
Microsoft Collective Communication Library
☆66 Updated 11 months ago
eniac / paella
Paella: Low-latency Model Serving with Virtualized GPU Scheduling
☆62 Updated last year
microsoft / mscclpp
MSCCL++: A GPU-driven communication stack for scalable AI applications
☆425 Updated this week
Bruce-Lee-LY / cuda_hook
Hooks CUDA-related dynamic libraries using automated code generation tools.
☆166 Updated last year
Mellanox / gpu_direct_rdma_access
Example code that uses DC QP to provide RDMA READ and WRITE operations to remote GPU memory
☆145 Updated last year