DGXC Benchmarking provides recipes in ready-to-use templates for evaluating performance of specific AI use cases across hardware and software combinations.
☆70 · Feb 26, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for dgxc-benchmarking
Users that are interested in dgxc-benchmarking are comparing it to the libraries listed below.
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node ☆64 · Dec 19, 2025 · Updated 3 months ago
- nv-one-logger enables tracking of GPU application progress over time and can help to identify overhead from workload and cluster ineffici… ☆22 · Nov 6, 2025 · Updated 4 months ago
- Linux Sysinfo Snapshot ☆65 · Feb 22, 2026 · Updated last month
- A community driven catalog of tools and products that are useful in the world of high performance computing (HPC) ☆11 · Jul 3, 2025 · Updated 8 months ago
- ATLAHS: An Application-centric Network Simulator Toolchain for AI, HPC, and Distributed Storage ☆73 · Mar 10, 2026 · Updated last week
- A Slurm-based HPC workload management environment, driven by Ansible. ☆67 · Updated this week
- Performance tests for multinode NGC.Ready certification ☆15 · Jan 28, 2026 · Updated last month
- NVIDIA Fleet Command is a hybrid-cloud platform for securely and remotely deploying, managing, and scaling AI across dozens or up to thou… ☆14 · Jul 20, 2022 · Updated 3 years ago
- Tooling for optimized, validated, and reproducible GPU-accelerated AI runtime in Kubernetes ☆129 · Updated this week
- A Kubernetes Operator to manage Node OS customizations. ☆48 · Updated this week
- Python wrappers for the FirecREST API ☆12 · Mar 16, 2026 · Updated last week
- A distributed storage benchmark for file systems, object stores & block devices with support for GPUs ☆256 · Mar 8, 2026 · Updated 2 weeks ago
- Create an Amazon EKS cluster and run a distributed training example ☆29 · Aug 19, 2024 · Updated last year
- A toolkit for discovering cluster network topology. ☆104 · Updated this week
- Utility for monitoring process, thread, OS and HW resources. ☆20 · Feb 11, 2026 · Updated last month
- NCX Infra Controller - Hardware Lifecycle Management and multitenant networking ☆102 · Updated this week
- This repository contains the results and code for the MLPerf™ Training v4.0 benchmark. ☆12 · Jun 11, 2024 · Updated last year
- This repo collects and organizes the resources you will need to follow a learning path towards Physical AI. From starting out with electr… ☆47 · Feb 9, 2026 · Updated last month
- Multi-GPU communication profiler and visualizer ☆39 · Jun 10, 2024 · Updated last year
- RPerf: Accurate Latency Measurement Framework for RDMA ☆15 · Sep 24, 2025 · Updated 5 months ago
- Aries Network Performance Counters Monitoring Library ☆11 · Nov 19, 2020 · Updated 5 years ago
- The CSCS ReFrame test suite ☆16 · Updated this week
- ☆10 · Dec 18, 2025 · Updated 3 months ago
- Prototype of OpenSHMEM for NVIDIA GPUs, developed as part of DoE Design Forward ☆25 · Apr 26, 2018 · Updated 7 years ago
- Pavilion is a Python 3 (3.6+) based framework for running and analyzing tests targeting HPC systems. ☆46 · Updated this week
- ☆229 · Feb 23, 2026 · Updated last month
- Show differences between directory trees ☆15 · Aug 9, 2025 · Updated 7 months ago
- A small C++ wrapper for managing Linux CPU sets and CPU affinity ☆11 · Dec 11, 2025 · Updated 3 months ago
- PyTorch code examples for measuring the performance of collective communication calls in AI workloads ☆19 · Sep 18, 2025 · Updated 6 months ago
- Run Slurm on Kubernetes. A Slinky project. ☆253 · Mar 13, 2026 · Updated last week
- ☆12 · Sep 15, 2025 · Updated 6 months ago
- Generate graphviz dot files from InfiniBand topology dumps. ☆16 · Feb 11, 2024 · Updated 2 years ago
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆76 · Jul 18, 2025 · Updated 8 months ago
- A compact and extensible image viewer ☆11 · Jun 22, 2020 · Updated 5 years ago
- A wrapper around SageMaker ML Lineage Tracking extending ML Lineage to end-to-end ML lifecycles, including additional capabilities around… ☆16 · Oct 14, 2021 · Updated 4 years ago
- Enables HPC Environment in an OpenStack Cloud ☆11 · Jan 12, 2018 · Updated 8 years ago
- Bunch of helper files for the Slurm resource manager ☆15 · May 26, 2025 · Updated 9 months ago
- A new memory mapping interface for efficient direct user-space access to byte-addressable storage, published in MICRO2022. ☆15 · Sep 29, 2022 · Updated 3 years ago
- A collection of molecular modelling tools for UCSF Chimera ☆18 · Mar 26, 2019 · Updated 6 years ago