aws-neuron / neuronx-nemo-megatron

☆39

Alternatives and similar repositories for neuronx-nemo-megatron

Users who are interested in neuronx-nemo-megatron are comparing it to the libraries listed below.

aws-neuron / aws-neuron-parallelcluster-samples
☆23 Updated last year
aws-neuron / neuronx-distributed
☆60 Updated last week
aws-neuron / nki-samples
☆40 Updated 2 weeks ago
aws-neuron / transformers-neuronx
☆112 Updated 6 months ago
aws-samples / aws-parallelcluster-megatron
☆14 Updated 4 years ago
aws-neuron / aws-neuron-samples
Example code for AWS Neuron SDK developers building inference and training applications
☆148 Updated this week
aws-samples / aws-efa-nccl-baseami-pipeline
EFA/NCCL base AMI build Packer and CodeBuild/Pipeline files. Also base Docker build files to enable EFA/NCCL in containers
☆44 Updated last year
aws-neuron / aws-neuron-sdk
Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and i…
☆533 Updated this week
huggingface / optimum-neuron
Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
☆235 Updated last week
NVIDIA / LDDL
Distributed preprocessing and data loading for language datasets
☆39 Updated last year
aws-neuron / upstreaming-to-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆19 Updated this week
aws-samples / awsome-distributed-training
Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.
☆331 Updated this week
aws-neuron / aws-neuron-reference-for-megatron-lm
☆14 Updated last year
aws / sagemaker-hyperpod-cli
A CLI tool that helps manage training jobs on the SageMaker HyperPod clusters orchestrated by Amazon EKS
☆27 Updated this week
aws-samples / aws-do-eks
Create, List, Update, Delete Amazon EKS clusters. Deploy and manage software on EKS. Run distributed model training and inference example…
☆60 Updated this week
NVIDIA-NeMo / Run
A tool to configure, launch and manage your machine learning experiments.
☆176 Updated this week
google / saxml
☆142 Updated last week
awslabs / ml-io
A high performance data access library for machine learning tasks
☆74 Updated last year
HabanaAI / Model-References
Reference models for Intel(R) Gaudi(R) AI Accelerator
☆167 Updated last week
awslabs / sagemaker-debugger
Amazon SageMaker Debugger provides functionality to save tensors during training of machine learning jobs and analyze those tensors
☆162 Updated last year
aws-samples / awsome-inference
☆51 Updated last week
aws-samples / sagemaker-studio-image-build-cli
CLI for building Docker images in SageMaker Studio using AWS CodeBuild.
☆56 Updated 3 years ago
aws-samples / aws-samples-for-ray
☆72 Updated last year
aws / sagemaker-pytorch-inference-toolkit
Toolkit for allowing inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker Pytorch Containers are at h…
☆141 Updated 10 months ago
aws-samples / sagemaker-ssh-helper
A helper library to connect into Amazon SageMaker with AWS Systems Manager and SSH (Secure Shell)
☆249 Updated last month
aws-samples / sagemaker-trainium-examples
☆19 Updated last year
aws-samples / sagemaker-studio-docker-cli-extension
SageMaker Studio Docker CLI Extension
☆13 Updated last year
aws-neuron / aws-neuron-eks-samples
☆24 Updated 2 months ago
AI-Hypercomputer / jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
☆67 Updated 4 months ago
awslabs / s3-connector-for-pytorch
The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access and store data in Amazon S3.
☆172 Updated last week