llm-d / llm-d-inference-schedulerLinks

Inference scheduler for llm-d

☆68

Alternatives and similar repositories for llm-d-inference-scheduler

Users that are interested in llm-d-inference-scheduler are comparing it to the libraries listed below

Sorting:

llm-d / llm-d-kv-cache-manager
Distributed KV cache coordinator
☆43Updated this week
kubernetes-sigs / inference-perf
GenAI inference performance benchmarking tool
☆71Updated this week
kubernetes-sigs / dra-example-driver
Example DRA driver that developers can fork and modify to get them started writing their own.
☆85Updated this week
NVIDIA / topograph
A toolkit for discovering cluster network topology.
☆59Updated last week
llm-d / llm-d-model-service
Simplified model deployment on llm-d
☆27Updated last month
NVIDIA / knavigator
knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes.
☆69Updated 2 weeks ago
kubernetes-sigs / jobset
JobSet: a k8s native API for distributed ML training and HPC workloads
☆246Updated this week
kubernetes-sigs / gateway-api-inference-extension
Gateway API Inference Extension
☆415Updated this week
run-ai / fake-gpu-operator
☆130Updated 2 weeks ago
kubernetes-sigs / wg-serving
WG Serving
☆28Updated last week
BaizeAI / kcover
🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads.
☆31Updated last week
project-codeflare / multi-cluster-app-dispatcher
Holistic job manager on Kubernetes
☆117Updated last year
containers / nri-plugins
A collection of community maintained NRI plugins
☆86Updated last week
modelpack / modctl
Command-line tools for managing OCI model artifacts, which are bundled based on Model Spec
☆31Updated this week
google / dranet
DraNet is a Kubernetes Network Driver that uses Dynamic Resource Allocation (DRA) to deliver high-performance networking for demanding ap…
☆98Updated this week
InftyAI / llmaz
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
☆228Updated this week
kubernetes-sigs / cni-dra-driver
CNI DRA Driver
☆27Updated 6 months ago
InftyAI / Manta
💫 A lightweight p2p-based cache system for model distributions on Kubernetes. Reframing now to make it an unified cache system with POSI…
☆24Updated 7 months ago
volcano-sh / resource-exporter
Resource Exporter for volcano scheduling, e.g. NUMA-Aware scheduling.
☆17Updated 2 months ago
NVIDIA / go-gpuallocator
Go Abstraction for Allocating NVIDIA GPUs with Custom Policies
☆116Updated last month
kube-queue / kube-queue
☆118Updated 2 years ago
kubernetes-sigs / apiserver-runtime
Libraries for implementing aggregated apiservers
☆92Updated 2 weeks ago
kubernetes-sigs / kube-scheduler-wasm-extension
All the things to make the scheduler extendable with wasm.
☆129Updated last month
intel / platform-aware-scheduling
Enabling Kubernetes to make pod placement decisions with platform intelligence.
☆176Updated 6 months ago
copilot-io / runtime-copilot
The main purpose of runtime copilot is to assist with node runtime management tasks such as configuring registries, upgrading versions, i…
☆12Updated 2 years ago
kerthcet / github-workflow-as-kube
Following the same workflows as Kubernetes. Widely used in InftyAI community.
☆13Updated 3 weeks ago
kubernetes-sigs / lws
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
☆526Updated this week
NVIDIA / k8s-nim-operator
An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.
☆119Updated this week
k8stopologyawareschedwg / resource-topology-exporter
Resource Topology exporter for Topology Aware Scheduler
☆14Updated last month
kubernetes-sigs / work-api
Kubernetes Work API
☆67Updated 2 months ago