mlcommons / training_results_v1.1
This repository contains the results and code for the MLPerf™ Training v1.1 benchmark.
☆23 · Updated last year
Alternatives and similar repositories for training_results_v1.1:
Users who are interested in training_results_v1.1 are comparing it to the repositories listed below:
- Automated machine learning as an AI-HPC benchmark ☆64 · Updated 2 years ago
- This repository contains the results and code for the MLPerf™ Training v1.0 benchmark. ☆37 · Updated 11 months ago
- This repository contains the results and code for the MLPerf™ Training v2.0 benchmark. ☆27 · Updated 11 months ago
- ☆73 · Updated 2 years ago
- A tool for examining GPU scheduling behavior. ☆71 · Updated 5 months ago
- A baseline repository of Auto-Parallelism in Training Neural Networks ☆142 · Updated 2 years ago
- ☆36 · Updated 2 years ago
- Examples for the TVM schedule API ☆98 · Updated last year
- System for automated integration of deep learning backends. ☆48 · Updated 2 years ago
- Benchmark code for the "Online normalizer calculation for softmax" paper ☆64 · Updated 6 years ago
- Training material for Nsight developer tools ☆143 · Updated 5 months ago
- oneCCL Bindings for Pytorch* ☆87 · Updated 3 weeks ago
- ☆79 · Updated 2 months ago
- [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration ☆196 · Updated 2 years ago
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆86 · Updated last week
- Synthesizer for optimal collective communication algorithms ☆102 · Updated 9 months ago
- ☆84 · Updated 9 months ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core) ☆122 · Updated 4 years ago
- ☆141 · Updated this week
- A home for the final text of all TVM RFCs. ☆101 · Updated 4 months ago
- An Efficient Pipelined Data Parallel Approach for Training Large Model ☆73 · Updated 4 years ago
- Python bindings for NVTX ☆66 · Updated last year
- ☆32 · Updated last year
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems ☆130 · Updated 4 years ago
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer ☆88 · Updated 11 months ago
- ☆70 · Updated 3 years ago
- ☆67 · Updated last month
- Microsoft Collective Communication Library ☆331 · Updated last year
- DietCode Code Release ☆61 · Updated 2 years ago
- NCCL Examples from Official NVIDIA NCCL Developer Guide. ☆15 · Updated 6 years ago