NVIDIA / cloudai
CloudAI Benchmark Framework
☆26Updated this week
Related projects:
- GPUDirect Async support for IB Verbs☆88Updated last year
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…☆118Updated 2 weeks ago
- NCCL Profiling Kit☆104Updated 2 months ago
- RCCL Performance Benchmark Tests☆41Updated last week
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.☆138Updated this week
- PyTorch process group third-party plugin for UCC☆18Updated 5 months ago
- RDMA and SHARP plugins for the NCCL library☆154Updated this week
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.☆108Updated 10 months ago
- A command-line utility to manage the configuration of a system's high-performance network interfaces for RoCE deployments☆26Updated last year
- Magnum IO community repo☆76Updated 3 months ago
- CUPTI GPU Profiler☆36Updated 5 years ago
- NVIDIA DPU OPs collection☆12Updated last year
- GPU Stress Test is a tool to stress the compute engine of NVIDIA Tesla GPUs by running a BLAS matrix multiply using different data types…☆71Updated last month
- An I/O benchmark for deep learning applications☆61Updated 2 weeks ago
- ☆35Updated 3 months ago
- Synthesizer for optimal collective communication algorithms☆94Updated 5 months ago
- Repository for MLCommons Chakra schema and tools☆55Updated 2 weeks ago
- Mellanox libibverbs☆52Updated 5 years ago
- ☆22Updated 3 years ago
- oneCCL Bindings for PyTorch*☆83Updated last week
- Repository for MLCommons Chakra schema and tools☆38Updated 8 months ago
- Multi-Instance-GPU profiling tool☆51Updated last year
- A tool for examining GPU scheduling behavior.☆67Updated last month
- Optimized primitives for collective multi-GPU communication☆19Updated 5 months ago
- Bandwidth test for ROCm☆45Updated this week
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.☆20Updated 3 months ago
- Fine-grained GPU sharing primitives☆139Updated 4 years ago
- ☆33Updated 2 weeks ago
- Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite☆57Updated 6 years ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …☆61Updated 2 years ago