llm-d / llm-d-inference-sim
A lightweight vLLM simulator for mocking out replicas.
☆85 Updated this week
Alternatives and similar repositories for llm-d-inference-sim
Users interested in llm-d-inference-sim are comparing it to the libraries listed below.
- Distributed KV cache scheduling & offloading libraries ☆104 Updated this week
- Incubating P/D sidecar for llm-d ☆16 Nov 13, 2025 Updated 3 months ago
- Helm charts for llm-d ☆52 Jul 22, 2025 Updated 6 months ago
- The Volcano Descheduler ☆21 Jan 24, 2025 Updated last year
- Open Model Engine (OME): Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T… ☆370 Updated this week
- Simplified model deployment on llm-d ☆28 Jul 2, 2025 Updated 7 months ago
- ☆18 Jun 18, 2025 Updated 7 months ago
- The main purpose of runtime copilot is to assist with node runtime management tasks such as configuring registries, upgrading versions, i… ☆12 May 16, 2023 Updated 2 years ago
- llm-d benchmark scripts and tooling ☆47 Updated this week
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work! ☆288 Jan 26, 2026 Updated 2 weeks ago
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management. ☆28 Feb 7, 2026 Updated last week
- CPU DRA Driver ☆31 Updated this week
- Variant optimization autoscaler for distributed inference workloads ☆27 Feb 6, 2026 Updated last week
- Operator for the mutating admission webhook for ClusterResourceOverride ☆18 Feb 6, 2026 Updated last week
- ☆22 Dec 21, 2025 Updated last month
- Achieve state of the art inference performance with modern accelerators on Kubernetes ☆2,465 Updated this week
- A stress testing tool for the scheduler in a large-scale scenario. ☆16 Apr 29, 2024 Updated last year
- Test Orchestrator for Performance and Scalability of AI pLatforms ☆16 Jan 26, 2026 Updated 2 weeks ago
- Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling ☆161 Updated this week
- A Portable Linux-based Firmware for NVMe Computational Storage Devices ☆30 Jun 10, 2025 Updated 8 months ago
- A set of tools for understanding F2FS usage of ZNS devices, which allow for identifying the on-device locations of files and inodes, mapp… ☆20 Jan 19, 2025 Updated last year
- GenAI inference performance benchmarking tool ☆145 Feb 6, 2026 Updated last week
- A high-performance and light-weight router for vLLM large scale deployment ☆112 Updated this week
- label ALL kubectl, kustomize, and helm objects, inline, without extra steps (including namespaces and CRDs) ☆15 Apr 22, 2024 Updated last year
- caniuse.com, but for kubernetes ☆27 Dec 25, 2024 Updated last year
- DRANET is a Kubernetes Network Driver that uses Dynamic Resource Allocation (DRA) to deliver high-performance networking for demanding ap… ☆161 Dec 9, 2025 Updated 2 months ago
- The schedule of the seminar ☆25 Dec 28, 2021 Updated 4 years ago
- This repository describes I/O traces of Google storage servers and disks synthesized by Thesios. Thesios synthesizes representative I/O t… ☆25 Apr 29, 2024 Updated last year
- 💫 A lightweight p2p-based cache system for model distributions on Kubernetes. Reframing now to make it a unified cache system with POSI… ☆25 Dec 6, 2024 Updated last year
- WG Serving ☆34 Dec 15, 2025 Updated last month
- Gateway API Inference Extension ☆583 Updated this week
- A large-scale simulation framework for LLM inference ☆530 Jul 25, 2025 Updated 6 months ago
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆74 Jul 18, 2025 Updated 6 months ago
- CNI DRA Driver ☆39 Oct 1, 2025 Updated 4 months ago
- ☆47 Dec 8, 2025 Updated 2 months ago
- LLM serving cluster simulator ☆135 Apr 25, 2024 Updated last year
- 🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads. ☆35 Feb 5, 2026 Updated last week
- ☆34 Jul 18, 2025 Updated 6 months ago
- DPU-Powered File System Virtualization over virtio-fs ☆77 Sep 17, 2025 Updated 4 months ago