☆110 · Jan 16, 2025 · Updated last year
Alternatives and similar repositories for transformers-neuronx
Users who are interested in transformers-neuronx are comparing it to the libraries listed below
- Example code for AWS Neuron SDK developers building inference and training applications ☆158 · Mar 10, 2026 · Updated last week
- ☆63 · Mar 13, 2026 · Updated last week
- Training and inference on AWS Trainium and Inferentia chips. ☆264 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆25 · Mar 5, 2026 · Updated 2 weeks ago
- Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and i… ☆583 · Mar 14, 2026 · Updated last week
- ☆58 · Feb 10, 2026 · Updated last month
- ☆22 · Mar 27, 2023 · Updated 2 years ago
- Cluster doctor skills ☆14 · Feb 20, 2026 · Updated last month
- A universal scalable machine learning model deployment solution ☆248 · Mar 12, 2026 · Updated last week
- Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stac… ☆255 · Apr 11, 2025 · Updated 11 months ago
- ☆23 · Nov 18, 2025 · Updated 4 months ago
- Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS. ☆403 · Mar 13, 2026 · Updated last week
- One stop shop for running AI/ML on AWS. ☆1,145 · Updated this week
- Learn how to use Transformer-based models for named-entity recognition (NER) tasks and how to analyze various model features, constraints… ☆15 · Jun 29, 2022 · Updated 3 years ago
- ☆23 · Aug 21, 2025 · Updated 6 months ago
- A training framework for large-scale language models based on Megatron-Core, the COOM Training Framework is designed to efficiently handl… ☆25 · Nov 14, 2025 · Updated 4 months ago
- Openfold inference architecture for Amazon EKS ☆11 · Oct 1, 2024 · Updated last year
- notebooks on langchain and llamaindex experiments ☆12 · Nov 2, 2023 · Updated 2 years ago
- ☆14 · Aug 29, 2023 · Updated 2 years ago
- Toolkit for allowing inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker Pytorch Containers are at h… ☆142 · Oct 7, 2024 · Updated last year
- Foundation Model Evaluations Library ☆278 · Aug 7, 2025 · Updated 7 months ago
- ☆39 · Oct 3, 2022 · Updated 3 years ago
- ☆14 · Nov 1, 2024 · Updated last year
- Retrieval-Augmented Generation battle! ☆64 · Updated this week
- ☆32 · Updated this week
- MLIR-based partitioning system ☆173 · Mar 14, 2026 · Updated last week
- Google TPU optimizations for transformers models ☆136 · Jan 23, 2026 · Updated last month
- Home for OctoML PyTorch Profiler ☆113 · Apr 24, 2023 · Updated 2 years ago
- Tokenflood is a load testing framework for simulating arbitrary loads on instruction-tuned LLMs ☆44 · Feb 25, 2026 · Updated 3 weeks ago
- ☆20 · Nov 23, 2022 · Updated 3 years ago
- ☆289 · Updated this week
- Example of applying CUDA graphs to LLaMA-v2 ☆12 · Aug 25, 2023 · Updated 2 years ago
- ☆14 · Jan 30, 2026 · Updated last month
- ☆89 · Aug 23, 2023 · Updated 2 years ago
- This repository contains the results and code for the MLPerf™ Training v2.1 benchmark. ☆15 · Aug 9, 2023 · Updated 2 years ago
- ☆56 · Jun 26, 2025 · Updated 8 months ago
- vLLM performance dashboard ☆43 · Apr 26, 2024 · Updated last year
- YoloV5 on SageMaker, including bring your own container ☆18 · Nov 23, 2020 · Updated 5 years ago
- ☆15 · Mar 6, 2024 · Updated 2 years ago