llm-d/llm-d-benchmark

llm-d / llm-d-benchmark

llm-d benchmark scripts and tooling

☆62

Alternatives and similar repositories for llm-d-benchmark

Users who are interested in llm-d-benchmark are comparing it to the repositories listed below.

llm-d / llm-d-model-service
Simplified model deployment on llm-d
☆29 · Jul 2, 2025 · Updated last year
llm-d / llm-d-routing-sidecar
Incubating P/D sidecar for llm-d
☆17 · Nov 13, 2025 · Updated 8 months ago
kubernetes-sigs / inference-perf
GenAI inference performance benchmarking tool
☆214 · Updated this week
llm-d / llm-d-kv-cache
Distributed KV cache scheduling & offloading libraries
☆165 · Updated this week
llm-d / llm-d-router
llm-d Router: The intelligent entry point for inference requests
☆272 · Updated this week
llm-d-incubation / llm-d-modelservice
helm charts for deploying models with llm-d
☆31 · Updated this week
llm-d / llm-d-inference-sim
A lightweight, configurable, and real-time simulator designed to mimic the behavior of vLLM without the need for GPUs or running actual h…
☆172 · Updated this week
llm-d-incubation / llm-d-infra
llm-d helm charts and deployment examples
☆59 · May 1, 2026 · Updated 2 months ago
llm-d-incubation / llm-d-fast-model-actuation
Kubernetes controllers for fast model actuation using vLLM sleep/wake and launcher-based model swapping
☆16 · Updated this week
llm-d-incubation / llm-d-planner
☆25 · Updated this week
llm-d / llm-d-workload-variant-autoscaler
Variant optimization autoscaler for distributed inference workloads
☆52 · Updated this week
AI-Hypercomputer / inference-benchmark
☆22 · Mar 11, 2026 · Updated 4 months ago
openshift-psap / auto-tuning-vllm
Auto-tuning for vllm. Getting the best performance out of your LLM deployment (vllm+guidellm+optuna)
☆64 · Jun 12, 2026 · Updated last month
kubernetes-sigs / gateway-api-inference-extension
Gateway API Inference Extension
☆725 · Updated this week
llm-d / llm-d-inference-payload-processor
Inference payload processor for llm-d
☆16 · Updated this week
kfirtoledo / multi-mcp
☆110 · Jul 21, 2025 · Updated last year
agents-first / clawdchan
Let my Claude talk to yours.
☆30 · Jul 22, 2026 · Updated last week
inference-sim / inference-sim
Inference Platform Simulation
☆21 · Updated this week
clubanderson / labeler
label ALL kubectl, kustomize, and helm objects, inline, without extra steps. (including namespaces and CRDs)
☆15 · Apr 22, 2024 · Updated 2 years ago
vllm-project / agentic-api
Stateful API logic for agentic applications using vLLM
☆56 · Updated this week
cloud-bulldozer / performance-dashboards
Performance dashboards from the Perf & Scale team
☆20 · Jul 17, 2026 · Updated last week
IBM / autopilot
A tool to detect infrastructure issues on cloud native AI systems
☆54 · Sep 18, 2025 · Updated 10 months ago
fmperf-project / fmperf
Cloud Native Benchmarking of Foundation Models
☆46 · Jul 31, 2025 · Updated 11 months ago
ai-dynamo / aiconfigurator
Offline optimization of your disaggregated Dynamo graph
☆377 · Updated this week
ubermorgenland / ingress-migration-kit
Discover ingress-nginx usage and auto-generate Gateway API migration plans before ingress-nginx reaches end-of-life (March 2026).
☆16 · Nov 26, 2025 · Updated 8 months ago
torch-spyre / sendnn-inference
Community maintained hardware plugin for vLLM on Spyre
☆52 · Updated this week
knoway-dev / knoway
An Envoy inspired, ultimate LLM-first gateway for LLM serving and downstream application developers and enterprises
☆27 · Apr 24, 2025 · Updated last year
sustainablecomputing / caspian
☆15 · May 28, 2024 · Updated 2 years ago
sallyom / otel-kubeadm
kubeadm with core components instrumented to export OpenTelemetry traces (etcd, kube-apiserver, crio)
☆13 · Oct 18, 2022 · Updated 3 years ago
sgl-project / rbg
A workload for deploying LLM inference services on Kubernetes
☆267 · Updated this week
linux-system-roles / metrics
An ansible role which configures metrics collection.
☆17 · Updated this week
Exgentic / exgentic
General agent evaluation framework
☆64 · Updated this week
d-run / drun-docs
d.run website
☆18 · Jul 3, 2026 · Updated 3 weeks ago
copilot-io / runtime-copilot
The main purpose of runtime copilot is to assist with node runtime management tasks such as configuring registries, upgrading versions, i…
☆13 · May 16, 2023 · Updated 3 years ago
IBM / ado
A framework for designing, executing and analysing experiment campaigns
☆56 · Updated this week
IBM / controller-zero-scaler
Automatically scales Kubernetes controllers to zero
☆16 · May 30, 2019 · Updated 7 years ago
IBM / solsa
Solution Service Architecture
☆26 · Jun 5, 2024 · Updated 2 years ago
kubernetes-sigs / wg-serving
WG Serving
☆38 · Mar 24, 2026 · Updated 4 months ago
aigw-project / aigw
The Intelligent Inference Scheduler for Large-scale Inference Services.
☆68 · Feb 12, 2026 · Updated 5 months ago