llm-d / llm-d-benchmark
llm-d benchmark scripts and tooling
☆33 · Updated this week
Alternatives and similar repositories for llm-d-benchmark
Users who are interested in llm-d-benchmark compare it to the repositories listed below.
- A tool to detect infrastructure issues on cloud native AI systems ☆51 · Updated 2 months ago
- InstaSlice Operator facilitates slicing of accelerators using stable APIs ☆47 · Updated this week
- Cloud Native Benchmarking of Foundation Models ☆44 · Updated 3 months ago
- GenAI inference performance benchmarking tool ☆123 · Updated this week
- llm-d helm charts and deployment examples ☆46 · Updated last month
- Inference scheduler for llm-d ☆105 · Updated this week
- Incubating P/D sidecar for llm-d ☆16 · Updated last week
- Simplified model deployment on llm-d ☆27 · Updated 4 months ago
- Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling ☆110 · Updated this week
- Enabling Kubernetes to make pod placement decisions with platform intelligence. ☆176 · Updated 9 months ago
- Health checks for Azure N- and H-series VMs. ☆54 · Updated last week
- A toolkit for discovering cluster network topology. ☆83 · Updated this week
- Distributed KV cache coordinator ☆87 · Updated this week
- Variant optimization autoscaler for distributed inference workloads ☆21 · Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆136 · Updated this week
- Gateway API Inference Extension ☆524 · Updated last week
- Example DRA driver that developers can fork and modify to get them started writing their own. ☆105 · Updated 3 weeks ago
- DOCA Platform manages provisioning and service orchestration for Bluefield DPUs ☆60 · Updated this week
- Holistic job manager on Kubernetes ☆116 · Updated last year
- WG Serving ☆31 · Updated last month
- NVIDIA Network Operator ☆297 · Updated this week
- Helm charts for llm-d ☆50 · Updated 3 months ago
- A light weight vLLM simulator, for mocking out replicas. ☆58 · Updated this week
- 🏃🏿♀️🏃🏽♀️🏃🏻♂️🕒CNCF Technical Advisory Group for Runtime ☆95 · Updated 7 months ago
- ☆268 · Updated this week
- Bridge operator repo ☆21 · Updated 2 months ago
- Repository to demo GPU Sharing with Time Slicing, MPS, MIG and others ☆50 · Updated last year
- InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing ☆30 · Updated 11 months ago
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆71 · Updated 4 months ago
- NVIDIA NCCL Tests for Distributed Training ☆123 · Updated last week