aws-samples / aws-eks-deep-learning-benchmark
Deep learning benchmark utility and optimization tips on EKS.
☆48 Updated 5 years ago
Alternatives and similar repositories for aws-eks-deep-learning-benchmark:
Users who are interested in aws-eks-deep-learning-benchmark are comparing it to the repositories listed below.
- Amazon SageMaker operator for Kubernetes ☆149 Updated last year
- CSI Driver of Amazon FSx for Lustre https://aws.amazon.com/fsx/lustre/ ☆131 Updated 3 months ago
- Deploying EFA in EKS utilizing GPUDirectRDMA where supported ☆37 Updated 6 months ago
- Train and Deploy Machine Learning Models on Kubernetes using Amazon EKS ☆165 Updated 5 years ago
- Kubeflow workshop on EKS. Mainly focus on AWS integration examples. Please go check kubeflow website http://kubeflow.org for other exampl… ☆98 Updated 4 years ago
- AWS virtual gpu device plugin provides capability to use smaller virtual gpus for your machine learning inference workloads ☆204 Updated last year
- Repository for benchmarking ☆78 Updated 10 months ago
- EFA/NCCL base AMI build Packer and CodeBuild/Pipeline files. Also base Docker build files to enable EFA/NCCL in containers ☆42 Updated last year
- Amazon EKS node drainer with AWS Lambda. ☆60 Updated 5 years ago
- Volume Controller for Kubernetes ☆67 Updated 2 years ago
- This example shows how to produce multimodal videos with audio using the Kinetics dataset on AWS Trainium and EC2 GPU orchestrated by EKS… ☆4 Updated 9 months ago
- A tool to extend the capabilities of an EKS cluster ☆67 Updated 4 years ago
- ☆60 Updated 2 years ago
- This is the documentation for AWS Deep Learning AMIs: your one-stop shop for deep learning in the cloud ☆46 Updated last year
- This repository contains tooling used to build the EKS Distro, and all the projects contained in https://github.com/aws/eks-distro. ☆80 Updated last week
- AWS AppMesh sidecar injector for EKS. ☆56 Updated 4 years ago
- The Chef cookbook used to build and bootstrap AWS ParallelCluster ☆111 Updated 2 weeks ago
- Running High Performance Computing (HPC) applications on EKS using Elastic Fabric Adapter (EFA). ☆8 Updated 4 years ago
- ☆33 Updated 6 years ago
- ACK service controller for Amazon SageMaker ☆44 Updated last week
- Distributed training using Kubeflow on Amazon EKS ☆86 Updated 2 weeks ago
- Create, List, Update, Delete Amazon EKS clusters. Deploy and manage software on EKS. Run distributed model training and inference example… ☆56 Updated last week
- Amazon EKS cluster consumption made easier ☆33 Updated 3 years ago
- Dynamic training with Apache MXNet reduces cost and time for training deep neural networks by leveraging AWS cloud elasticity and scale. … ☆56 Updated 2 years ago
- Toolkit for running MXNet training scripts on SageMaker. Dockerfiles used for building SageMaker MXNet Containers are at https://github.c… ☆60 Updated 2 months ago
- Seldon Core Operator for Kubernetes ☆12 Updated 5 years ago
- CLENCLI enables you to quickly and predictably create, change, and improve your cloud projects. It is an open source tool that simplifies… ☆59 Updated 2 years ago
- Fork of NVIDIA device plugin for Kubernetes with support for shared GPUs by declaring GPUs multiple times ☆88 Updated 2 years ago
- A controller to help manage App Mesh resources for a Kubernetes cluster. ☆186 Updated last week
- Throttling your pods in a Kubernetes cluster. ☆33 Updated last year