NVIDIA / nvidia-terraform-modules
Infrastructure as code for GPU accelerated managed Kubernetes clusters.
☆56Updated 6 months ago
Alternatives and similar repositories for nvidia-terraform-modules
Users who are interested in nvidia-terraform-modules are comparing it to the repositories listed below.
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen…☆204Updated last week
- Kubeflow on AWS☆187Updated last month
- MLOps on Amazon EKS☆106Updated last week
- Repository for open inference protocol specification☆59Updated 5 months ago
- AI on GKE is a collection of examples, best-practices, and prebuilt solutions to help build, deploy, and scale AI Platforms on Google Kub…☆324Updated 4 months ago
- This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It …☆47Updated 5 months ago
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.☆131Updated this week
- Create, List, Update, Delete Amazon EKS clusters. Deploy and manage software on EKS. Run distributed model training and inference example…☆63Updated 2 weeks ago
- AI on EKS - Tested AI/ML for Amazon Elastic Kubernetes Service☆133Updated this week
- This repository aims to showcase how to fine-tune a foundation model (FM) in an Amazon EKS cluster, using JupyterHub to provision notebooks and craft both…☆49Updated 4 months ago
- ☆111Updated 9 months ago
- ☆264Updated this week
- ☆56Updated last week
- ☆47Updated 9 months ago
- A top-like tool for monitoring GPUs in a cluster☆85Updated last year
- Prepare requirements and deploy Flyte using Helm☆79Updated 7 months ago
- Finetune LLMs on K8s by using Runbooks☆170Updated last year
- Module, Model, and Tensor Serialization/Deserialization☆272Updated 2 months ago
- ☆24Updated 5 months ago
- ACK service controller for Amazon SageMaker☆50Updated 3 weeks ago
- User documentation for KServe.☆109Updated last week
- Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.☆245Updated this week
- markdown docs☆92Updated this week
- Distributed Model Serving Framework☆178Updated last month
- A set of Docker images that include popular frameworks for machine learning, data science and visualization.☆133Updated last week
- Kubeflow workshop on EKS. Mainly focuses on AWS integration examples. Please check the Kubeflow website http://kubeflow.org for other exampl…☆99Updated 4 years ago
- Amazon SageMaker operator for Kubernetes☆149Updated 2 years ago
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes.☆110Updated 2 weeks ago
- Controller for ModelMesh☆239Updated 4 months ago
- ☆73Updated last year