llm-d-incubation / workload-variant-autoscaler
Variant optimization autoscaler for distributed inference workloads
☆21 · Updated this week
Alternatives and similar repositories for workload-variant-autoscaler
Users interested in workload-variant-autoscaler are comparing it to the repositories listed below.
- llm-d benchmark scripts and tooling · ☆33 · Updated this week
- Inference scheduler for llm-d · ☆105 · Updated this week
- A collection of community-maintained NRI plugins · ☆97 · Updated last week
- A lightweight vLLM simulator for mocking out replicas · ☆58 · Updated this week
- ☆268 · Updated this week
- GenAI inference performance benchmarking tool · ☆123 · Updated last week
- JobSet: a Kubernetes-native API for distributed ML training and HPC workloads · ☆281 · Updated this week
- ☆32 · Updated last week
- Example DRA driver that developers can fork and modify to get started writing their own · ☆105 · Updated 3 weeks ago
- Enabling Kubernetes to make pod placement decisions with platform intelligence · ☆176 · Updated 9 months ago
- A toolkit for discovering cluster network topology · ☆83 · Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment · ☆136 · Updated this week
- Cloud Native Artificial Intelligence Model Format Specification · ☆141 · Updated this week
- Holistic job manager on Kubernetes · ☆116 · Updated last year
- ☆30 · Updated 2 months ago
- The Kernel Module Management operator builds, signs, and loads kernel modules in Kubernetes clusters.