NVIDIA / dgxc-benchmarking
DGXC Benchmarking provides recipes in ready-to-use templates for evaluating performance of specific AI use cases across hardware and software combinations.
☆35 Updated last week
Alternatives and similar repositories for dgxc-benchmarking
Users who are interested in dgxc-benchmarking are comparing it to the libraries listed below.
- This is a plugin which lets EC2 developers use libfabric as a network provider while running NCCL applications. ☆185 Updated last week
- RCCL Performance Benchmark Tests ☆75 Updated 3 weeks ago
- CloudAI Benchmark Framework ☆72 Updated last week
- Reference implementations of MLPerf™ HPC training benchmarks ☆49 Updated 6 months ago
- Container plugin for Slurm Workload Manager ☆379 Updated 3 weeks ago
- NVIDIA NCCL Tests for Distributed Training ☆110 Updated last week
- An I/O benchmark for deep learning applications ☆90 Updated this week
- ☆22 Updated this week
- A tool for bandwidth measurements on NVIDIA GPUs. ☆527 Updated 5 months ago
- Magnum IO community repo ☆98 Updated 3 weeks ago
- ☆69 Updated 7 months ago
- NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs ☆575 Updated 3 weeks ago
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the … ☆216 Updated last week
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆150 Updated last week
- ROCm Communication Collectives Library (RCCL) ☆364 Updated this week
- MIG Partition Editor for NVIDIA GPUs ☆212 Updated this week
- NCCL Profiling Kit ☆143 Updated last year
- Unified Collective Communication Library ☆275 Updated last week
- A hierarchical collective communications library with portable optimizations ☆36 Updated 9 months ago
- Tools to deploy GPU clusters in the Cloud ☆33 Updated 2 years ago
- ☆371 Updated last year
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆62 Updated 2 months ago
- CUDA checkpoint and restore utility ☆367 Updated 7 months ago
- Microsoft Collective Communication Library ☆66 Updated 9 months ago
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud. ☆121 Updated last year
- Azure HPC/AI VM Images ☆116 Updated last week
- ☆22 Updated last month
- PyTorch UCC plugin ☆23 Updated 4 years ago
- A tool to detect infrastructure issues on cloud native AI systems ☆47 Updated last month
- RDMA and SHARP plugins for the NCCL library ☆203 Updated last week