kubernetes-sigs/gateway-api-inference-extension



Gateway API Inference Extension

☆597

Alternatives and similar repositories for gateway-api-inference-extension

Users interested in gateway-api-inference-extension are comparing it to the repositories listed below.


kubernetes-sigs / lws
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
☆673 · Feb 26, 2026 · Updated last week
kubernetes-sigs / inference-perf
GenAI inference performance benchmarking tool
☆151 · Feb 27, 2026 · Updated last week
kubernetes-sigs / wg-serving
WG Serving
☆34 · Dec 15, 2025 · Updated 2 months ago
InftyAI / llmaz
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
☆293 · Jan 26, 2026 · Updated last month
envoyproxy / ai-gateway
Manages Unified Access to Generative AI Services built on Envoy Gateway
☆1,399 · Feb 27, 2026 · Updated last week
llm-d / llm-d
Achieve state of the art inference performance with modern accelerators on Kubernetes
☆2,543 · Updated this week
llm-d / llm-d-inference-scheduler
Inference scheduler for llm-d
☆135 · Updated this week
kubernetes-sigs / jobset
JobSet: a k8s native API for distributed ML training and HPC workloads
☆317 · Updated this week
kubernetes-sigs / kueue
Kubernetes-native Job Queueing
☆2,347 · Updated this week
sgl-project / ome
Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T…
☆384 · Updated this week
BaizeAI / kcover
🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads.
☆35 · Updated this week
AI-Hypercomputer / inference-benchmark
☆18 · Jun 18, 2025 · Updated 8 months ago
vllm-project / aibrix
Cost-efficient and pluggable Infrastructure components for GenAI inference
☆4,650 · Feb 27, 2026 · Updated last week
kubeai-project / kubeai
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-te…
☆1,158 · Feb 23, 2026 · Updated last week
InftyAI / Manta
💫 A lightweight p2p-based cache system for model distributions on Kubernetes. Reframing now to make it a unified cache system with POSI…
☆26 · Dec 6, 2024 · Updated last year
sgl-project / rbg
A workload for deploying LLM inference services on Kubernetes
☆179 · Updated this week
vllm-project / production-stack
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
☆2,187 · Feb 27, 2026 · Updated last week
kubernetes-sigs / wg-device-management
Prototypes and experiments for WG Device Management.
☆15 · Feb 11, 2026 · Updated 3 weeks ago
knoway-dev / knoway
An Envoy inspired, ultimate LLM-first gateway for LLM serving and downstream application developers and enterprises
☆26 · Apr 24, 2025 · Updated 10 months ago
chaunceyjiang / fake-gpu
This project is designed to simulate GPU information, making it easier to test scenarios where a GPU is not available.
☆65 · Jan 9, 2026 · Updated last month
kserve / kserve
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
☆5,162 · Updated this week
NVIDIA / KAI-Scheduler
KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale
☆1,160 · Updated this week
envoyproxy / gateway
Manages Envoy Proxy as a Standalone or Kubernetes-based Application Gateway
☆2,537 · Updated this week
DaoCloud / ckube
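A high-performance proxy component for the Kubernetes APIServer. It proxies the APIServer's List requests, while other request types are reverse-proxied directly to the native APIServer. CKube additionally supports pagination, search, and indexing. CKube is also 100% compatible with native kubectl and ku…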
☆19 · Sep 16, 2022 · Updated 3 years ago
containerd / nri
Node Resource Interface
☆366 · Feb 27, 2026 · Updated last week
ray-project / kuberay
A toolkit to run Ray applications on Kubernetes
☆2,355 · Updated this week
NVIDIA / k8s-dra-driver-gpu
NVIDIA DRA Driver for GPUs
☆579 · Updated this week
llm-d / llm-d-kv-cache
Distributed KV cache scheduling & offloading libraries
☆108 · Updated this week
kubernetes-sigs / gateway-api
Repository for the next iteration of composite service (e.g. Ingress) and load balancing APIs.
☆2,683 · Updated this week
Project-HAMi / HAMi
Heterogeneous AI Computing Virtualization Middleware (Project under CNCF)
☆3,047 · Updated this week
ai-dynamo / dynamo
A Datacenter Scale Distributed Inference Serving Framework
☆6,154 · Updated this week
luskits / luscsi
Provides deploy scripts and CSI for Lustre.
☆14 · Oct 27, 2025 · Updated 4 months ago
copilot-io / runtime-copilot
The main purpose of runtime copilot is to assist with node runtime management tasks such as configuring registries, upgrading versions, i…
☆12 · May 16, 2023 · Updated 2 years ago
NVIDIA / gpu-operator
NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
☆2,572 · Updated this week
NVIDIA / nvkind
☆195 · Jan 20, 2026 · Updated last month
kubernetes-sigs / cloud-provider-kind
Cloud provider for KIND clusters
☆434 · Feb 22, 2026 · Updated last week
schednex-ai / schednex
Smart Kubernetes Scheduling
☆83 · Feb 27, 2026 · Updated last week
d-run / drun-docs
d.run website
☆15 · Feb 26, 2026 · Updated last week
kaito-project / kaito
Kubernetes AI Toolchain Operator
☆885 · Feb 27, 2026 · Updated last week