mlcommons / training_results_v3.1
This repository contains the results and code for the MLPerf™ Training v3.1 benchmark.
☆17 Updated 2 weeks ago
Alternatives and similar repositories for training_results_v3.1:
Users who are interested in training_results_v3.1 are comparing it to the repositories listed below.
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the … ☆86 Updated last week
- This repository contains the experimental PyTorch native float8 training UX ☆219 Updated 5 months ago
- This repository contains the results and code for the MLPerf™ Training v3.0 benchmark. ☆12 Updated last year
- This repository contains the results and code for the MLPerf™ Training v0.7 benchmark. ☆56 Updated last year
- This repository contains the results and code for the MLPerf™ Training v4.0 benchmark. ☆12 Updated 7 months ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆153 Updated last month
- ☆279 Updated last week
- Applied AI experiments and examples for PyTorch ☆216 Updated last week
- PyTorch RFCs (experimental) ☆131 Updated 5 months ago
- Research and development for optimizing transformers ☆125 Updated 3 years ago
- A library to analyze PyTorch traces. ☆325 Updated this week
- ☆245 Updated 6 months ago
- Distributed preprocessing and data loading for language datasets ☆39 Updated 9 months ago
- JAX-Toolbox ☆279 Updated this week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆218 Updated this week
- Implementation of a Transformer, but completely in Triton ☆253 Updated 2 years ago
- Container plugin for Slurm Workload Manager ☆314 Updated 2 months ago
- A library for unit scaling in PyTorch ☆122 Updated 2 months ago
- ☆171 Updated last week
- Tools to deploy GPU clusters in the Cloud ☆30 Updated last year
- Torch Distributed Experimental ☆115 Updated 5 months ago
- Fast low-bit matmul kernels in Triton ☆199 Updated last week
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆43 Updated 6 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆186 Updated last week
- ☆158 Updated 7 months ago
- jax-triton contains integrations between JAX and OpenAI Triton ☆371 Updated last week
- Extensible collectives library in Triton ☆77 Updated 4 months ago
- ☆294 Updated 5 months ago
- ☆157 Updated last year
- oneCCL Bindings for Pytorch* ☆87 Updated 3 weeks ago