llm-d / llm-d-deployer

Helm charts for llm-d

☆50

Alternatives and similar repositories for llm-d-deployer

Users who are interested in llm-d-deployer are comparing it to the repositories listed below.

NVIDIA / k8s-nim-operator
An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.
☆131 Updated last week
kubernetes-sigs / inference-perf
GenAI inference performance benchmarking tool
☆105 Updated this week
llm-d-incubation / llm-d-infra
llm-d helm charts and deployment examples
☆43 Updated 2 weeks ago
NVIDIA / nvkind
☆174 Updated this week
kubernetes-sigs / wg-serving
WG Serving
☆30 Updated this week
llm-d / llm-d-inference-scheduler
Inference scheduler for llm-d
☆99 Updated this week
llm-d / llm-d-kv-cache-manager
Distributed KV cache coordinator
☆78 Updated last week
kubernetes-sigs / jobset
JobSet: a k8s native API for distributed ML training and HPC workloads
☆266 Updated last week
kubernetes-sigs / dra-example-driver
Example DRA driver that developers can fork and modify to get them started writing their own.
☆94 Updated last month
opendatahub-io / caikit-tgis-serving
☆19 Updated this week
llm-d / llm-d-benchmark
llm-d benchmark scripts and tooling
☆30 Updated this week
kubernetes / dynamic-resource-allocation
☆38 Updated last week
project-codeflare / multi-cluster-app-dispatcher
Holistic job manager on Kubernetes
☆116 Updated last year
AI-Hypercomputer / inference-benchmark
☆17 Updated 4 months ago
schednex-ai / schednex
Smart Kubernetes Scheduling
☆81 Updated this week
coreweave / ml-containers
☆37 Updated this week
NVIDIA / topograph
A toolkit for discovering cluster network topology.
☆72 Updated last week
run-ai / runai-model-streamer
☆257 Updated last week
llm-d / llm-d-routing-sidecar
Incubating P/D sidecar for llm-d
☆16 Updated 3 weeks ago
llm-d / llm-d-model-service
Simplified model deployment on llm-d
☆27 Updated 3 months ago
foundation-model-stack / multi-nic-cni
☆40 Updated last month
llmariner / llmariner
Extensible generative AI platform on Kubernetes with OpenAI-compatible APIs.
☆90 Updated last week
project-codeflare / instaslice
InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing
☆30 Updated 10 months ago
NVIDIA / knavigator
knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes.
☆70 Updated 3 months ago
kserve / open-inference-protocol
Repository for open inference protocol specification
☆59 Updated 5 months ago
cnvrg / metagpu
K8s device plugin for GPU sharing
☆99 Updated 2 years ago
kubernetes-sigs / kjob
KJob: Tool for CLI-loving ML researchers
☆39 Updated last week
modelpack / model-spec
Cloud Native Artificial Intelligence Model Format Specification
☆107 Updated last week
kubernetes-sigs / gateway-api-inference-extension
Gateway API Inference Extension
☆495 Updated this week
cncf / ai-conformance
☆31 Updated last week