essentialsofparallelcomputing / EssentialsOfParallelComputing
Main Book repository for the Parallel and High Performance Computing book, Manning Publications
☆188 Updated 2 years ago
Alternatives and similar repositories for EssentialsOfParallelComputing:
Users who are interested in EssentialsOfParallelComputing are comparing it to the repositories listed below
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆206 Updated last month
- Examples from Programming in Parallel with CUDA ☆115 Updated last year
- Exercises and Solutions for "Programming Your GPU with OpenMP: A Hands-On Introduction" ☆125 Updated 2 months ago
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆587 Updated 2 months ago
- Stepwise optimizations of DGEMM on CPU, eventually outperforming Intel MKL, even when multithreaded. ☆121 Updated 2 years ago
- A set of hands-on tutorials for CUDA programming ☆205 Updated 9 months ago
- ☆503 Updated this week
- Step-by-step optimization of CUDA SGEMM ☆271 Updated 2 years ago
- CUDA Kernel Benchmarking Library ☆547 Updated last month
- Training material for Nsight developer tools ☆142 Updated 5 months ago
- ☆402 Updated 9 years ago
- Future home of hpc-tutorials.llnl.gov ☆229 Updated 5 months ago
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems ☆130 Updated 4 years ago
- CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. … ☆373 Updated last year
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 Updated 3 years ago
- Examples for using SYCL on CUDA ☆60 Updated 2 weeks ago
- CUDA Matrix Multiplication Optimization ☆152 Updated 5 months ago
- Collection of benchmarks to measure basic GPU capabilities ☆280 Updated 2 weeks ago
- Supplementary material/programming exercises ☆74 Updated 3 years ago
- Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben A… ☆258 Updated last week
- Samples demonstrating how to use the Compute Sanitizer Tools and Public API ☆73 Updated last year
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆197 Updated last month
- ☆224 Updated this week
- This is a mirror of https://gitlab.inria.fr/starpu/starpu where our development happens, but contributions are welcome here too! ☆68 Updated this week
- Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/) ☆658 Updated 5 months ago
- N-Ways to Multi-GPU Programming ☆15 Updated last year
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆45 Updated 3 months ago
- Next generation LAPACK implementation for ROCm platform ☆97 Updated this week
- Information about many aspects of high-performance computing. Wiki content moved to ~/docs. ☆280 Updated this week
- Tutorials for the Kokkos C++ Performance Portability Programming Ecosystem ☆306 Updated this week