spcl / dace
DaCe - Data Centric Parallel Programming
☆566 · Updated this week
Alternatives and similar repositories for dace
Users who are interested in dace are comparing it to the libraries listed below.
- Kernel Tuner ☆374 · Updated this week
- STREAM, for lots of devices written in many programming models ☆352 · Updated 3 months ago
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆211 · Updated this week
- Unified Collective Communication Library ☆280 · Updated last week
- A code generator for array-based code on CPUs and GPUs ☆618 · Updated this week
- ☆269 · Updated this week
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆130 · Updated this week
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆335 · Updated this week
- collection of benchmarks to measure basic GPU capabilities ☆468 · Updated last month
- Pluto: An automatic polyhedral parallelizer and locality optimizer ☆311 · Updated 3 months ago
- ☆288 · Updated 2 months ago
- ☆187 · Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆256 · Updated this week
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆145 · Updated this week
- Forked from https://bitbucket.org/berkeleylab/cs-roofline-toolkit/src/master/ ☆24 · Updated 6 years ago
- The Foundation for All Legate Libraries ☆233 · Updated this week
- CUDA Kernel Benchmarking Library ☆773 · Updated this week
- Benchmark for measuring the performance of sparse and irregular memory access. ☆82 · Updated 3 months ago
- A Python compiler design toolkit. ☆457 · Updated this week
- Assembler for NVIDIA Volta and Turing GPUs ☆234 · Updated 3 years ago
- This is a set of simple programs that can be used to explore the features of a parallel platform. ☆466 · Updated 3 months ago
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆836 · Updated 2 months ago
- A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP) ☆431 · Updated this week
- GPUOcelot: A dynamic compilation framework for PTX ☆217 · Updated 10 months ago
- Python SYCL bindings and SYCL-based Python Array API library ☆119 · Updated this week
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems ☆134 · Updated 5 years ago
- A light-weight MPI profiler. ☆102 · Updated 2 months ago
- Rodinia benchmark ☆192 · Updated 2 years ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆165 · Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆132 · Updated last week