run-ai / k8s-launcher
☆12 Updated last year
Alternatives and similar repositories for k8s-launcher:
Users who are interested in k8s-launcher are comparing it to the libraries listed below.
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆64 Updated this week
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆91 Updated this week
- Holistic job manager on Kubernetes ☆114 Updated last year
- ☆30 Updated last week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆90 Updated this week
- Module, Model, and Tensor Serialization/Deserialization ☆220 Updated last month
- Repository for open inference protocol specification ☆50 Updated 8 months ago
- elastic-gpu-agent is a Kubernetes device plugin for GPU resources allocation on node. ☆54 Updated 2 years ago
- This is a fork/refactoring of the ajmyyra/ambassador-auth-oidc project ☆88 Updated 11 months ago
- Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes ☆332 Updated this week
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆210 Updated this week
- ☆174 Updated this week
- K8s device plugin for GPU sharing ☆100 Updated last year
- Helm charts for the KubeRay project ☆43 Updated last week
- NVIDIA NCCL Tests for Distributed Training ☆85 Updated 2 weeks ago
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆354 Updated this week
- GenAI inference performance benchmarking tool ☆21 Updated this week
- elastic-gpu-scheduler is a Kubernetes scheduler extender for GPU resources scheduling. ☆140 Updated 2 years ago
- Simulated large clusters for Kubernetes scheduler validation. ☆15 Updated 2 years ago
- Elastic Deep Learning Training based on Kubernetes by Leveraging EDL and Volcano ☆32 Updated last year
- ☆119 Updated 2 weeks ago
- Cloud Native Benchmarking of Foundation Models ☆24 Updated 4 months ago
- Controller for ModelMesh ☆226 Updated last week
- ☆114 Updated 2 years ago
- Distributed Model Serving Framework ☆159 Updated 2 weeks ago
- A distributed engine for intelligent workload ☆27 Updated last month
- GPU plugin to the node feature discovery for Kubernetes ☆298 Updated 10 months ago
- Automatic tuning for ML model deployment on Kubernetes ☆81 Updated 4 months ago
- A tool to detect infrastructure issues on cloud native AI systems ☆28 Updated this week
- Deploying EFA in EKS utilizing GPUDirectRDMA where supported ☆37 Updated 5 months ago