mlcommons / training
Reference implementations of MLPerf™ training benchmarks
☆1,649 Updated last month
Alternatives and similar repositories for training:
Users who are interested in training compare it to the libraries listed below
- Reference implementations of MLPerf™ inference benchmarks ☆1,327 Updated this week
- Benchmarking Deep Learning operations on different hardware ☆1,081 Updated 3 years ago
- Collective communications library with various primitives for multi-machine training. ☆1,273 Updated this week
- A benchmark framework for Tensorflow ☆1,148 Updated last year
- A performant and modular runtime for TensorFlow ☆759 Updated 2 weeks ago
- FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/ ☆1,271 Updated this week
- A domain specific language to express machine learning workloads. ☆1,756 Updated last year
- Optimized primitives for collective multi-GPU communication ☆3,538 Updated last month
- ☆577 Updated 6 years ago
- Mesh TensorFlow: Model Parallelism Made Easier ☆1,604 Updated last year
- nGraph has moved to OpenVINO ☆1,350 Updated 4 years ago
- The Tensor Algebra SuperOptimizer for Deep Learning ☆703 Updated 2 years ago
- ☆388 Updated 2 years ago
- HugeCTR is a high efficiency GPU framework designed for Click-Through-Rate (CTR) estimating training ☆976 Updated 5 months ago
- TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance. ☆914 Updated this week
- common in-memory tensor structure ☆959 Updated this week
- ☆372 Updated 7 years ago
- NCCL Tests ☆1,026 Updated last week
- PyTorch elastic training ☆730 Updated 2 years ago
- Dive into Deep Learning Compiler ☆647 Updated 2 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description. ☆979 Updated 5 months ago
- A GPipe implementation in PyTorch ☆835 Updated 7 months ago
- ☆1,657 Updated 6 years ago
- TensorFlow/TensorRT integration ☆739 Updated last year
- "Multi-Level Intermediate Representation" Compiler Infrastructure ☆1,738 Updated 3 years ago
- A Python-level JIT compiler designed to make unmodified PyTorch programs faster. ☆1,035 Updated 10 months ago
- Compiler for Neural Network hardware accelerators ☆3,270 Updated 10 months ago
- Efficient GPU kernels for block-sparse matrix multiplication and convolution ☆1,038 Updated last year
- Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training ☆1,768 Updated this week
- Model analysis tools for TensorFlow ☆1,262 Updated 3 weeks ago