llm-d / llm-d-deployer
Helm charts for llm-d
☆42 Updated this week
Alternatives and similar repositories for llm-d-deployer
Users who are interested in llm-d-deployer are comparing it to the repositories listed below.
- InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing ☆28 Updated 6 months ago
- GenAI inference performance benchmarking tool ☆58 Updated this week
- Inference scheduler for llm-d ☆56 Updated this week
- Simplified model deployment on llm-d ☆24 Updated 2 weeks ago
- ☆19 Updated this week
- Distributed KV cache coordinator ☆35 Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆114 Updated this week
- ☆43 Updated 3 months ago
- WG Serving ☆27 Updated last week
- InstaSlice Operator facilitates slicing of accelerators using stable APIs ☆38 Updated this week
- AppWrapper controller for Kueue ☆14 Updated last week
- Repository to demo GPU Sharing with Time Slicing, MPS, MIG and others ☆42 Updated 8 months ago
- Open Data Hub operator to manage ODH component integrations ☆78 Updated this week
- ☆37 Updated this week
- ☆36 Updated this week
- ☆157 Updated last week
- Example DRA driver that developers can fork and modify to get them started writing their own. ☆74 Updated last month
- Model Registry provides a single pane of glass for ML model developers to index and manage models, versions, and ML artifacts metadata. I… ☆129 Updated this week
- Test Orchestrator for Performance and Scalability of AI pLatforms ☆15 Updated this week
- Smart Kubernetes Scheduling ☆79 Updated this week
- A toolkit for discovering cluster network topology. ☆54 Updated last week
- Extensible generative AI platform on Kubernetes with OpenAI-compatible APIs. ☆80 Updated 2 weeks ago
- K8s device plugin for GPU sharing ☆98 Updated 2 years ago
- This repository contains resources, documentation and artifacts describing LLM agents ☆14 Updated 4 months ago
- ☆221 Updated this week
- ☆38 Updated last week
- Distributed Model Serving Framework ☆170 Updated 2 weeks ago
- Artifacts for the Distributed Workloads stack as part of ODH ☆31 Updated this week
- Operator for managing Node Feature Discovery deployment ☆70 Updated 3 weeks ago
- Repository for open inference protocol specification ☆56 Updated last month