kubernetes-sigs / wg-serving
WG Serving
☆30Updated last week
Alternatives and similar repositories for wg-serving
Users who are interested in wg-serving are comparing it to the repositories listed below.
- GenAI inference performance benchmarking tool☆93Updated this week
- Example DRA driver that developers can fork and modify to get them started writing their own.☆88Updated 2 weeks ago
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.☆124Updated last week
- JobSet: a k8s native API for distributed ML training and HPC workloads☆257Updated this week
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes.☆69Updated last month
- llm-d helm charts and deployment examples☆40Updated this week
- InstaSlice Operator facilitates slicing of accelerators using stable APIs☆44Updated this week
- ☆167Updated this week
- Cloud Native Artificial Intelligence Model Format Specification☆91Updated last week
- Inference scheduler for llm-d☆86Updated last week
- Following the same workflows as Kubernetes. Widely used in the InftyAI community.☆14Updated 2 months ago
- Holistic job manager on Kubernetes☆116Updated last year
- agent-sandbox enables easy management of isolated, stateful, singleton workloads, ideal for use cases like AI agent runtimes.☆37Updated last week
- Model Registry provides a single pane of glass for ML model developers to index and manage models, versions, and ML artifacts metadata. I…☆149Updated this week
- Kubernetes Work API☆68Updated 3 weeks ago
- CNI DRA Driver☆29Updated 7 months ago
- K8s device plugin for GPU sharing☆99Updated 2 years ago
- CNCF Technical Advisory Group for Runtime☆95Updated 4 months ago
- Simplified model deployment on llm-d☆27Updated 2 months ago
- Gateway API Inference Extension☆471Updated this week
- ☆59Updated last year
- Operator for managing Node Feature Discovery deployment☆71Updated last month
- Distributed KV cache coordinator☆66Updated this week
- Incubating P/D sidecar for llm-d☆15Updated last month
- CAPK is a provider for Cluster API (CAPI) that allows users to deploy fake, Kubemark-backed machines to their clusters.☆79Updated this week
- Smart Kubernetes Scheduling☆81Updated this week
- The main purpose of runtime copilot is to assist with node runtime management tasks such as configuring registries, upgrading versions, i…☆12Updated 2 years ago
- ☆140Updated this week
- A toolkit for discovering cluster network topology.☆65Updated last week
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication☆561Updated 2 weeks ago