aws-samples / eks-efa-examples
Running High Performance Computing (HPC) applications on EKS using Elastic Fabric Adapter (EFA).
☆8Updated 4 years ago
Alternatives and similar repositories for eks-efa-examples
Users that are interested in eks-efa-examples are comparing it to the libraries listed below
- EFA/NCCL base AMI build Packer and CodeBuild/Pipeline files. Also base Docker build files to enable EFA/NCCL in containers☆43Updated last year
- Scripts to customize AWS ParallelCluster☆24Updated last month
- Deep learning benchmark utility and optimization tips on EKS.☆48Updated 5 years ago
- Monitoring Dashboard for AWS ParallelCluster☆35Updated 9 months ago
- CSI Driver of Amazon FSx for Lustre https://aws.amazon.com/fsx/lustre/ ☆139Updated last month
- The Chef cookbook used to build and bootstrap AWS ParallelCluster☆110Updated this week
- Deploying EFA in EKS utilizing GPUDirectRDMA where supported☆37Updated 9 months ago
- ☆82Updated 9 months ago
- ☆12Updated last month
- ☆32Updated 2 weeks ago
- A sample integration of AWS services with SLURM☆78Updated 2 months ago
- Amazon SageMaker operator for Kubernetes☆149Updated last year
- Manage AWS ParallelCluster through an easy to use web interface☆66Updated 2 years ago
- ☆48Updated last week
- AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud.☆866Updated last week
- Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and i…☆529Updated this week
- Scale-Out Computing on AWS is a solution that helps customers deploy and operate a multiuser environment for computationally intensive wo…☆128Updated last month
- Template scripts to setup Docker Images compatible with running on MNP Batch☆14Updated 6 years ago
- Contains example recipes that demonstrate how to build HPC systems using AWS services and solutions.☆80Updated last week
- ACK service controller for Amazon SageMaker☆48Updated last month
- AWS Libfabric☆41Updated last week
- aws-parallelcluster-node is the python package installed on the Amazon EC2 instances launched as part of AWS ParallelCluster☆64Updated last week
- ☆16Updated last year
- ☆11Updated last month
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.☆177Updated this week
- Create, List, Update, Delete Amazon EKS clusters. Deploy and manage software on EKS. Run distributed model training and inference example…☆60Updated last month
- Cluster Toolkit is an open-source software offered by Google Cloud which makes it easy for customers to deploy AI/ML and HPC environments…☆272Updated this week
- Train and Deploy Machine Learning Models on Kubernetes using Amazon EKS☆167Updated 5 years ago
- ☆59Updated 2 weeks ago
- The open source version of the AWS ParallelCluster User Guide.☆25Updated 2 years ago