UoB-HPC / ipu-hpc-cookbook
Useful tutorials and recipes for developers doing low-level work with the Graphcore IPU
☆21 · Updated 3 years ago
Alternatives and similar repositories for ipu-hpc-cookbook
Users who are interested in ipu-hpc-cookbook are comparing it to the repositories listed below.
- Kernel Tuner ☆377 · Updated last week
- ❤️ CUDA/C++ GPU graph analytics simplified. ☆31 · Updated 3 years ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆138 · Updated this week
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆165 · Updated last week
- ROCm Communication Collectives Library (RCCL) ☆405 · Updated last week
- Poplar libraries ☆121 · Updated 2 years ago
- ☆19 · Updated 2 years ago
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems ☆134 · Updated 5 years ago
- STREAM, for lots of devices written in many programming models ☆352 · Updated 3 months ago
- Experimental projects related to TensorRT ☆116 · Updated this week
- A library of GPU kernels for sparse matrix operations. ☆277 · Updated 5 years ago
- High-performance, GPU-aware communication library ☆86 · Updated last week
- Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018 ☆73 · Updated 5 years ago
- A hierarchical collective communications library with portable optimizations ☆37 · Updated last year
- Reference implementations of MLPerf™ HPC training benchmarks ☆49 · Updated 10 months ago
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆341 · Updated 3 weeks ago
- CUDA Kernel Benchmarking Library ☆782 · Updated 2 weeks ago
- High Performance Linpack for Next-Generation AMD HPC Accelerators ☆64 · Updated 2 weeks ago
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs ☆59 · Updated this week
- Unified Collective Communication Library ☆286 · Updated this week
- Collection of benchmarks to measure basic GPU capabilities ☆476 · Updated 2 months ago
- ☆17 · Updated 3 years ago
- Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite ☆68 · Updated 7 years ago
- A Micro-benchmarking Tool for HPC Networks ☆33 · Updated 3 months ago
- DaCe - Data Centric Parallel Programming ☆568 · Updated this week
- NCCL Examples from Official NVIDIA NCCL Developer Guide. ☆19 · Updated 7 years ago
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆212 · Updated 3 weeks ago
- ☆291 · Updated 3 months ago
- oneAPI Collective Communications Library (oneCCL) ☆252 · Updated last week
- ☆24 · Updated 2 years ago