mlcommons / training
Reference implementations of MLPerf™ training benchmarks
☆1,673 Updated 2 weeks ago
Alternatives and similar repositories for training
Users who are interested in training compare it to the libraries listed below.
- Reference implementations of MLPerf™ inference benchmarks ☆1,386 Updated this week
- Collective communications library with various primitives for multi-machine training. ☆1,305 Updated last week
- A benchmark framework for Tensorflow ☆1,153 Updated last year
- FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/ ☆1,342 Updated this week
- Benchmarking Deep Learning operations on different hardware ☆1,087 Updated 4 years ago
- TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance. ☆945 Updated this week
- Optimized primitives for collective multi-GPU communication ☆3,744 Updated this week
- oneAPI Deep Neural Network Library (oneDNN) ☆3,796 Updated this week
- A domain specific language to express machine learning workloads. ☆1,759 Updated 2 years ago
- ☆583 Updated 7 years ago
- nGraph has moved to OpenVINO ☆1,348 Updated 4 years ago
- Efficient GPU kernels for block-sparse matrix multiplication and convolution ☆1,040 Updated last year
- common in-memory tensor structure ☆1,002 Updated 2 weeks ago
- NCCL Tests ☆1,125 Updated 3 weeks ago
- ☆1,657 Updated 6 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description. ☆986 Updated 8 months ago
- PyTorch extensions for high performance and large scale training. ☆3,322 Updated last month
- Dive into Deep Learning Compiler ☆645 Updated 2 years ago
- Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators ☆1,541 Updated 5 years ago
- A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep lear… ☆5,407 Updated this week
- Low-precision matrix multiplication ☆1,803 Updated last year
- HugeCTR is a high efficiency GPU framework designed for Click-Through-Rate (CTR) estimating training ☆1,004 Updated 2 months ago
- The Tensor Algebra SuperOptimizer for Deep Learning ☆714 Updated 2 years ago
- PyTorch elastic training ☆729 Updated 2 years ago
- Mesh TensorFlow: Model Parallelism Made Easier ☆1,607 Updated last year
- TVM integration into PyTorch ☆452 Updated 5 years ago
- Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training ☆1,796 Updated last week
- Make huge neural nets fit in memory ☆2,794 Updated 5 years ago
- A GPipe implementation in PyTorch ☆841 Updated 10 months ago
- ☆390 Updated 2 years ago