llm-d / llm-d-inference-sim
A lightweight vLLM simulator for mocking out replicas.
☆18 Updated this week
Alternatives and similar repositories for llm-d-inference-sim
Users who are interested in llm-d-inference-sim are comparing it to the libraries listed below.
- Inference scheduler for llm-d ☆41 Updated this week
- A tool to detect infrastructure issues on cloud native AI systems ☆36 Updated last week
- Artifacts for our NSDI'23 paper TGS ☆75 Updated 11 months ago
- Artifact of OSDI '24 paper, "Llumnix: Dynamic Scheduling for Large Language Model Serving" ☆61 Updated 11 months ago
- Cloud Native Benchmarking of Foundation Models ☆34 Updated 2 weeks ago
- Distributed KV cache coordinator ☆31 Updated last week
- NVIDIA NCCL Tests for Distributed Training ☆91 Updated last week
- An interference-aware scheduler for fine-grained GPU sharing ☆137 Updated 4 months ago
- ☆65 Updated last month
- ☆25 Updated 2 months ago
- ☆13 Updated last week
- 🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads. ☆30 Updated 5 months ago
- ☆62 Updated 11 months ago
- An efficient GPU resource sharing system with fine-grained control for Linux platforms. ☆83 Updated last year
- Hooked CUDA-related dynamic libraries by using automated code generation tools. ☆156 Updated last year
- Here are my personal paper reading notes (including cloud computing, resource management, systems, machine learning, deep learning, and o… ☆103 Updated last week
- Intercepting CUDA runtime calls with LD_PRELOAD ☆39 Updated 11 years ago
- ☆19 Updated 6 months ago
- Repository for MLCommons Chakra schema and tools ☆99 Updated 2 months ago
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling ☆58 Updated last year
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆121 Updated last year
- DeepSeek-V3/R1 inference performance simulator ☆129 Updated 2 months ago
- Ultra | Ultimate | Unified CCL ☆75 Updated last week
- Serverless Paper Reading and Discussion ☆37 Updated 2 years ago
- The criu-coordinator tool aims to enable checkpoint/restore support for distributed applications with CRIU. ☆22 Updated this week
- An OS kernel module for fast **remote** fork using advanced datacenter networking (RDMA). ☆62 Updated 3 months ago
- GeminiFS: A Companion File System for GPUs ☆29 Updated 3 months ago
- ☆30 Updated last month
- Stateful LLM Serving ☆70 Updated 2 months ago
- A library developed by Volcano Engine for high-performance reading and writing of PyTorch model files. ☆19 Updated 4 months ago