Packer and CodeBuild/Pipeline files for building an EFA/NCCL base AMI, plus base Docker build files to enable EFA/NCCL in containers.
☆44 · Sep 19, 2023 · Updated 2 years ago
Alternatives and similar repositories for aws-efa-nccl-baseami-pipeline
Users who are interested in aws-efa-nccl-baseami-pipeline are comparing it to the libraries listed below.
- ☆13 · May 30, 2025 · Updated 11 months ago
- ☆14 · Sep 15, 2025 · Updated 8 months ago
- Scripts to customize AWS ParallelCluster ☆29 · Sep 5, 2025 · Updated 8 months ago
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications. ☆215 · May 20, 2026 · Updated last week
- This is a sample solution for logging EC2 Spot Instance Interruptions, storing them in CloudWatch and S3, and visualizing them with a Clo… ☆74 · Sep 18, 2024 · Updated last year
- Monitoring Dashboard for AWS ParallelCluster ☆40 · May 18, 2026 · Updated last week
- ☆15 · Jun 6, 2025 · Updated 11 months ago
- 1-Click Cluster Deployment with AWS ParallelCluster ☆30 · Sep 23, 2022 · Updated 3 years ago
- The Chef cookbook used to build and bootstrap AWS ParallelCluster ☆113 · May 22, 2026 · Updated last week
- Template scripts to setup Docker Images compatible with running on MNP Batch ☆15 · Apr 9, 2019 · Updated 7 years ago
- ☆11 · Jun 29, 2021 · Updated 4 years ago
- A sample integration of AWS services with SLURM ☆82 · Apr 18, 2025 · Updated last year
- Deploy and scale distributed python applications on Amazon EKS using Ray ☆20 · Apr 13, 2026 · Updated last month
- ☆85 · Dec 17, 2025 · Updated 5 months ago
- This is a repo of HPC workshops that will be used to facilitate on-site engagements, or be used at conferences and summits. ☆35 · May 15, 2018 · Updated 8 years ago
- Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS. ☆427 · May 20, 2026 · Updated last week
- This is the documentation for AWS Deep Learning AMIs: your one-stop shop for deep learning in the cloud ☆45 · Jun 15, 2023 · Updated 2 years ago
- This repository contains sample code to help you extend your LSF cluster to the cloud. It provides fully functional examples of how to s… ☆16 · Dec 17, 2025 · Updated 5 months ago
- ☆22 · Nov 19, 2025 · Updated 6 months ago
- ☆44 · Jun 3, 2024 · Updated last year
- Some crazy experiments ☆35 · Sep 3, 2025 · Updated 8 months ago
- Build and run container environment for LFRic ☆11 · Jan 8, 2024 · Updated 2 years ago
- High performance NCCL plugin for Bagua. ☆15 · Sep 15, 2021 · Updated 4 years ago
- Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and i… ☆605 · May 22, 2026 · Updated last week
- Create, List, Update, Delete Amazon EKS clusters. Deploy and manage software on EKS. Run distributed model training and inference example… ☆66 · Updated this week
- AWS Slurm Cluster for EDA Workloads ☆28 · Apr 13, 2026 · Updated last month
- Natural language processing & computer vision models optimized for AWS ☆143 · Jan 5, 2023 · Updated 3 years ago
- OCI container images. A Slinky project. ☆21 · May 18, 2026 · Updated last week
- ☆65 · Jan 8, 2026 · Updated 4 months ago
- SageMaker specific extensions to TensorFlow. ☆54 · Jul 23, 2024 · Updated last year
- Contains reference architecture scripts for running the OpenPiton regression using auto-scaling SLURM cluster. ☆24 · Feb 25, 2026 · Updated 3 months ago
- MPI Benchmark on AWS HPC cluster ☆20 · Jan 31, 2020 · Updated 6 years ago
- Openfold inference architecture for Amazon EKS ☆11 · Oct 1, 2024 · Updated last year
- a model of deepfm using keras ☆12 · Apr 2, 2019 · Updated 7 years ago
- gpu tester detects broken and slow gpus in a cluster ☆71 · Feb 19, 2023 · Updated 3 years ago
- Particle In Cell code in Julia ☆13 · Oct 14, 2025 · Updated 7 months ago
- The open source version of the AWS ParallelCluster User Guide. ☆25 · Jun 16, 2023 · Updated 2 years ago
- Build scripts for PyTorch @ NERSC ☆12 · Dec 27, 2025 · Updated 5 months ago
- ☆12 · Jul 6, 2023 · Updated 2 years ago