A lightweight, configurable, and real-time simulator designed to mimic the behavior of vLLM without the need for GPUs or running actual heavy models.
☆149Jun 14, 2026Updated this week
Alternatives and similar repositories for llm-d-inference-sim
Users who are interested in llm-d-inference-sim are comparing it to the libraries listed below.
- Distributed KV cache scheduling & offloading libraries☆156Updated this week
- Helm charts for llm-d☆52Jul 22, 2025Updated 10 months ago
- Incubating P/D sidecar for llm-d☆17Nov 13, 2025Updated 7 months ago
- llm-d Router: The intelligent entry point for inference requests☆220Updated this week
- Simplified model deployment on llm-d☆29Jul 2, 2025Updated 11 months ago
- llm-d benchmark scripts and tooling☆63Updated this week
- Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T…☆468Updated this week
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!☆306Jan 26, 2026Updated 4 months ago
- The Volcano Descheduler☆24Jan 24, 2025Updated last year
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management.☆30Jun 12, 2026Updated last week
- A set of tools for understanding F2FS usage of ZNS devices, which allow for identifying the on-device locations of files and inodes, mapp…☆20Jan 19, 2025Updated last year
- CPU DRA Driver☆51Updated this week
- ZNS Append-only based LSM key-value store☆21Sep 22, 2023Updated 2 years ago
- ☆273Updated this week
- ☆10Nov 21, 2023Updated 2 years ago
- Accurate, large-scale, and extensible simulator for LLM inference Systems☆620Jul 25, 2025Updated 10 months ago
- LLM serving cluster simulator☆155Apr 25, 2024Updated 2 years ago
- Gateway API Inference Extension☆693Updated this week
- A Portable Linux-based Firmware for NVMe Computational Storage Devices☆35Jun 10, 2025Updated last year
- ☆28Jul 29, 2025Updated 10 months ago
- High-performance KV cache storage for LLM inference — GPU offloading, SSD caching, and cross-node sharing via RDMA. Works with vLLM and S…☆142Updated this week
- Operator for the mutating admission webhook for ClusterResourceOverride☆19Jun 10, 2026Updated last week
- ☆21Mar 11, 2026Updated 3 months ago
- NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com…☆548Jun 11, 2026Updated last week
- tensorflow fork with Salus integration☆12Jan 7, 2022Updated 4 years ago
- Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond☆1,070Updated this week
- ☆37Jul 18, 2025Updated 11 months ago
- ☆22May 22, 2026Updated 3 weeks ago
- Variant optimization autoscaler for distributed inference workloads☆44Updated this week
- Adaptive consistency replication with reinforcement learning for large scale globally distributed storage.☆13Sep 29, 2025Updated 8 months ago
- the main repository for the multicluster global hub☆23Jun 11, 2026Updated last week
- DeepSeek-V3/R1 inference performance simulator☆196Mar 27, 2025Updated last year
- A research group at UCSD CSE focused on Advanced Data Analytics: data management and systems for ML/AI and data science.☆11Feb 27, 2026Updated 3 months ago
- CLI for the Serverless Supercomputer☆25Sep 17, 2025Updated 9 months ago
- Bridging Immutable and Mutable Abstractions for Distributed Data Analytics☆12May 15, 2019Updated 7 years ago
- Albis: High-Performance File Format for Big Data Systems☆21Jul 12, 2018Updated 7 years ago
- A high-performance and light-weight router for vLLM large scale deployment☆268May 6, 2026Updated last month
- 💫 A lightweight p2p-based cache system for model distributions on Kubernetes. Reframing now to make it an unified cache system with POSI…☆27Dec 6, 2024Updated last year
- ☆12Jul 18, 2025Updated 11 months ago