NVIDIA / mlperf-common
NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions
☆22 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for mlperf-common
- Benchmarks to capture important workloads. ☆28 Updated 5 months ago
- RCCL Performance Benchmark Tests ☆51 Updated last month
- Reference implementations of MLPerf™ HPC training benchmarks ☆42 Updated 6 months ago
- MLPerf™ logging library ☆30 Updated this week
- Bandwidth test for ROCm ☆49 Updated this week
- oneCCL Bindings for Pytorch* ☆86 Updated 3 weeks ago
- pytorch ucc plugin ☆17 Updated 3 years ago
- Pytorch process group third-party plugin for UCC ☆20 Updated 7 months ago
- MPI accelerator-integrated communication extensions ☆32 Updated last year
- Optimized primitives for collective multi-GPU communication ☆21 Updated 7 months ago
- XLA integration of Open Neural Network Exchange (ONNX) ☆19 Updated 6 years ago
- ROCm BLAS marshalling library ☆121 Updated this week
- ROC_SHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆41 Updated last year
- Samples demonstrating how to use the Compute Sanitizer Tools and Public API ☆68 Updated last year
- A tracing JIT for PyTorch ☆17 Updated 2 years ago
- ROCm SPARSE marshalling library ☆69 Updated this week
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆63 Updated 2 years ago
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆73 Updated this week
- AMD SMI ☆42 Updated this week
- A task benchmark ☆40 Updated 3 months ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆29 Updated 2 months ago
- TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs) ☆36 Updated this week
- ☆14 Updated 2 months ago
- Benchmark for measuring the performance of sparse and irregular memory access. ☆75 Updated this week
- General policies for MLPerf™ including submission rules, coding standards, etc. ☆28 Updated this week
- ☆20 Updated this week
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆101 Updated this week
- Magnum IO community repo ☆79 Updated 6 months ago
- A Deep Learning Meta-Framework and HPC Benchmarking Library ☆81 Updated 2 years ago
- hipFFT is a FFT marshalling library. ☆54 Updated this week