A lightweight, configurable, and real-time simulator designed to mimic the behavior of vLLM without the need for GPUs or running actual heavy models.
☆103Mar 19, 2026Updated last week
Alternatives and similar repositories for llm-d-inference-sim
Users that are interested in llm-d-inference-sim are comparing it to the libraries listed below.
- Distributed KV cache scheduling & offloading libraries☆117Mar 20, 2026Updated last week
- Incubating P/D sidecar for llm-d☆16Nov 13, 2025Updated 4 months ago
- Inference scheduler for llm-d☆142Mar 19, 2026Updated last week
- llm-d benchmark scripts and tooling☆51Updated this week
- Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T…☆404Updated this week
- Achieve state of the art inference performance with modern accelerators on Kubernetes☆2,657Updated this week
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!☆292Jan 26, 2026Updated 2 months ago
- The Volcano Descheduler☆24Jan 24, 2025Updated last year
- ⚖️ CNCF Code of Conduct WG☆17Jan 30, 2025Updated last year
- The main purpose of runtime copilot is to assist with node runtime management tasks such as configuring registries, upgrading versions, i…☆12May 16, 2023Updated 2 years ago
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management.☆29Mar 6, 2026Updated 3 weeks ago
- A set of tools for understanding F2FS usage of ZNS devices, which allow for identifying the on-device locations of files and inodes, mapp…☆20Jan 19, 2025Updated last year
- Automatically scales Kubernetes controllers to zero☆16May 30, 2019Updated 6 years ago
- CPU DRA Driver☆36Mar 20, 2026Updated last week
- Build and deploy Node.js application on Kubernetes☆16Sep 17, 2025Updated 6 months ago
- ZNS Append-only based LSM key-value store☆21Sep 22, 2023Updated 2 years ago
- ☆10Nov 21, 2023Updated 2 years ago
- A Portable Linux-based Firmware for NVMe Computational Storage Devices☆31Jun 10, 2025Updated 9 months ago
- LLM serving cluster simulator☆140Apr 25, 2024Updated last year
- A large-scale simulation framework for LLM inference☆564Jul 25, 2025Updated 8 months ago
- Gateway API Inference Extension☆616Updated this week
- ☆28Jul 29, 2025Updated 7 months ago
- A stress testing tool for the scheduler in a large-scale scenario.☆16Apr 29, 2024Updated last year
- A high-performance and light-weight router for vLLM large scale deployment☆160Updated this week
- Manages vllm-nccl dependency☆17Jun 3, 2024Updated last year
- Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond☆813Mar 17, 2026Updated last week
- Operator for the mutating admission webhook for ClusterResourceOverride☆18Mar 13, 2026Updated 2 weeks ago
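- High-performance proxy component for the Kubernetes APIServer. It proxies the APIServer's List requests, while other request types are reverse-proxied directly to the native APIServer. CKube additionally supports pagination, search, and indexing. CKube is also 100% compatible with native kubectl and ku…☆19Sep 16, 2022Updated 3 years ago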
- tensorflow fork with Salus integration☆12Jan 7, 2022Updated 4 years ago
- DPU-Powered File System Virtualization over virtio-fs☆80Sep 17, 2025Updated 6 months ago
- ☆35Jul 18, 2025Updated 8 months ago
- ☆22Dec 21, 2025Updated 3 months ago
- Variant optimization autoscaler for distributed inference workloads☆34Mar 19, 2026Updated last week
- DeepSeek-V3/R1 inference performance simulator☆189Mar 27, 2025Updated last year
- Bridging Immutable and Mutable Abstractions for Distributed Data Analytics☆12May 15, 2019Updated 6 years ago
- Albis: High-Performance File Format for Big Data Systems☆21Jul 12, 2018Updated 7 years ago
- An LLM Mock Server that supports simulating the protocols of all LLM providers.☆11Oct 18, 2025Updated 5 months ago
- NGINX Lua plugin for adaptive concurrency control used to handle overload in services☆14Dec 30, 2022Updated 3 years ago
- ☆232Updated this week