awslabs / s3-connector-for-pytorch
The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access and store data in Amazon S3.
☆145 Updated this week
Alternatives and similar repositories for s3-connector-for-pytorch:
Users who are interested in s3-connector-for-pytorch are comparing it to the libraries listed below.
- ☆167 Updated last year
- ☆102 Updated last month
- Kubernetes Operator, Ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆89 Updated this week
- The Triton backend for the PyTorch TorchScript models. ☆143 Updated last week
- Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips. ☆218 Updated this week
- ☆40 Updated last week
- Create, List, Update, Delete Amazon EKS clusters. Deploy and manage software on EKS. Run distributed model training and inference example… ☆52 Updated 3 weeks ago
- Deploying EFA in EKS utilizing GPUDirectRDMA where supported ☆37 Updated 4 months ago
- Module, Model, and Tensor Serialization/Deserialization ☆212 Updated this week
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup… ☆345 Updated this week
- Example code for AWS Neuron SDK developers building inference and training applications ☆135 Updated last week
- ☆23 Updated 10 months ago
- AWS virtual GPU device plugin provides capability to use smaller virtual GPUs for your machine learning inference workloads ☆203 Updated last year
- CSI Driver of Amazon FSx for Lustre https://aws.amazon.com/fsx/lustre/ ☆131 Updated 3 weeks ago
- Kubeflow on AWS ☆178 Updated last month
- ☆51 Updated last week
- ☆23 Updated this week
- This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic … ☆86 Updated this week
- EFA/NCCL base AMI build Packer and CodeBuild/Pipeline files. Also base Docker build files to enable EFA/NCCL in containers ☆42 Updated last year
- PyTorch per step fault tolerance (actively under development) ☆243 Updated this week
- ACK service controller for Amazon SageMaker ☆43 Updated this week
- Distributed Model Serving Framework ☆159 Updated 4 months ago
- A helper library to connect into Amazon SageMaker with AWS Systems Manager and SSH (Secure Shell) ☆231 Updated 2 months ago
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications. ☆163 Updated this week
- Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS. ☆244 Updated this week
- This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It … ☆42 Updated 2 weeks ago
- Staging area for ongoing enhancements to Ray focused on improving integration with AWS and other Amazon technologies. ☆66 Updated last year
- A high performance data access library for machine learning tasks ☆74 Updated last year
- CUDA checkpoint and restore utility ☆289 Updated 3 weeks ago
- Scalable and Performant Data Loading ☆217 Updated this week