NVIDIA / nvkind
☆191 Updated 2 weeks ago
Alternatives and similar repositories for nvkind
Users who are interested in nvkind are comparing it to the repositories listed below
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆304 Updated this week
- Example DRA driver that developers can fork and modify to get them started writing their own. ☆117 Updated this week
- GenAI inference performance benchmarking tool ☆142 Updated last week
- K8s device plugin for GPU sharing ☆98 Updated 2 years ago
- NVIDIA DRA Driver for GPUs ☆557 Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆146 Updated this week
- WG Serving ☆34 Updated last month
- Cloud Native Artificial Intelligence Model Format Specification ☆175 Updated last week
- GPU plugin to the node feature discovery for Kubernetes ☆308 Updated last year
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆656 Updated last week
- ☆282 Updated 2 weeks ago
- ☆212 Updated this week
- Holistic job manager on Kubernetes ☆116 Updated last year
- Gateway API Inference Extension ☆576 Updated this week
- KJob: Tool for CLI-loving ML researchers ☆41 Updated last month
- Following the same workflows as Kubernetes. Widely used in the InftyAI community. ☆13 Updated 2 months ago
- llm-d helm charts and deployment examples ☆48 Updated last month
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆74 Updated 6 months ago
- All the things to make the scheduler extendable with wasm. ☆129 Updated 2 months ago
- Kubernetes Work API ☆69 Updated 3 weeks ago
- Node Resource Interface ☆361 Updated last week
- NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated compu… ☆177 Updated this week
- Library for multi-cluster controllers with controller-runtime ☆262 Updated this week
- Simplified model deployment on llm-d ☆28 Updated 7 months ago
- The Google Cloud Storage FUSE Container Storage Interface (CSI) Plugin. ☆151 Updated this week
- Run Slurm on Kubernetes. A Slinky project. ☆230 Updated this week
- ☆62 Updated last year
- CAPK is a provider for Cluster API (CAPI) that allows users to deploy fake, Kubemark-backed machines to their clusters. ☆88 Updated last week
- This repository hosts the Multi-Cluster Service APIs. Providers can import packages in this repo to ensure their multi-cluster service co… ☆253 Updated last month
- Smart Kubernetes Scheduling ☆82 Updated this week