A lightweight vLLM simulator for mocking out replicas.
☆87 Mar 2, 2026 Updated this week
Alternatives and similar repositories for llm-d-inference-sim
Users who are interested in llm-d-inference-sim are comparing it to the libraries listed below.
- Distributed KV cache scheduling & offloading libraries ☆108 Updated this week
- Incubating P/D sidecar for llm-d ☆16 Nov 13, 2025 Updated 3 months ago
- Inference scheduler for llm-d ☆135 Updated this week
- ☆16 Apr 15, 2025 Updated 10 months ago
- The Volcano Descheduler ☆23 Jan 24, 2025 Updated last year
- Open Model Engine (OME): Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T… ☆384 Updated this week
- Simplified model deployment on llm-d ☆28 Jul 2, 2025 Updated 8 months ago
- ☆18 Jun 18, 2025 Updated 8 months ago
- The main purpose of runtime copilot is to assist with node runtime management tasks such as configuring registries, upgrading versions, i… ☆12 May 16, 2023 Updated 2 years ago
- llm-d benchmark scripts and tooling ☆48 Updated this week
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work! ☆293 Jan 26, 2026 Updated last month
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management. ☆29 Updated this week
- CPU DRA Driver ☆32 Feb 27, 2026 Updated last week
- Let CI Robot automatically execute commands for your PR/issue in your Github repository, hosting on Github Action does not require your s… ☆13 Feb 9, 2026 Updated 3 weeks ago
- Variant optimization autoscaler for distributed inference workloads ☆33 Updated this week
- Operator for the mutating admission webhook for ClusterResourceOverride ☆18 Feb 13, 2026 Updated 3 weeks ago
- ☆22 Dec 21, 2025 Updated 2 months ago
- Achieve state of the art inference performance with modern accelerators on Kubernetes ☆2,543 Updated this week
- Test Orchestrator for Performance and Scalability of AI pLatforms ☆16 Feb 27, 2026 Updated last week
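- A high-performance proxy component for the Kubernetes APIServer: it proxies the APIServer's List requests, while other request types are reverse-proxied directly to the native APIServer. CKube additionally supports pagination, search, and indexing. CKube is also 100% compatible with native kubectl and ku… ☆19 Sep 16, 2022 Updated 3 years ago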
- Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling ☆166 Updated this week
- A set of tools for understanding F2FS usage of ZNS devices, which allow for identifying the on-device locations of files and inodes, mapp… ☆20 Jan 19, 2025 Updated last year
- A Portable Linux-based Firmware for NVMe Computational Storage Devices ☆31 Jun 10, 2025 Updated 8 months ago
- GenAI inference performance benchmarking tool ☆151 Feb 27, 2026 Updated last week
- An Envoy inspired, ultimate LLM-first gateway for LLM serving and downstream application developers and enterprises ☆26 Apr 24, 2025 Updated 10 months ago
- ZNS Append-only based LSM key-value store ☆21 Sep 22, 2023 Updated 2 years ago
- ☆222 Feb 23, 2026 Updated last week
- label ALL kubectl, kustomize, and helm objects, inline, without extra steps.(including namespaces and CRDs) ☆15 Apr 22, 2024 Updated last year
- A high-performance and light-weight router for vLLM large scale deployment ☆131 Updated this week
- caniuse.com, but for kubernetes ☆27 Dec 25, 2024 Updated last year
- DRANET is a Kubernetes Network Driver that uses Dynamic Resource Allocation (DRA) to deliver high-performance networking for demanding ap… ☆160 Dec 9, 2025 Updated 2 months ago
- This repository describes I/O traces of Google storage servers and disks synthesized by Thesios. Thesios synthesizes representative I/O t… ☆25 Apr 29, 2024 Updated last year
- WG Serving ☆34 Dec 15, 2025 Updated 2 months ago
- The schedule of the seminar ☆25 Dec 28, 2021 Updated 4 years ago
- 💫 A lightweight p2p-based cache system for model distributions on Kubernetes. Reframing now to make it an unified cache system with POSI… ☆26 Dec 6, 2024 Updated last year
- Gateway API Inference Extension ☆597 Updated this week
- A large-scale simulation framework for LLM inference ☆545 Jul 25, 2025 Updated 7 months ago
- Operator for managing Node Feature Discovery deployment ☆74 Jan 29, 2026 Updated last month
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆74 Jul 18, 2025 Updated 7 months ago