CoffeeBeforeArch / spring_2020_tutorial
"Hardware, Software, and Compilers! Oh My!" tutorial files
☆16Updated 5 years ago
Alternatives and similar repositories for spring_2020_tutorial
Users that are interested in spring_2020_tutorial are comparing it to the libraries listed below
- Examples for using SYCL on CUDA☆62Updated 3 months ago
- ☆44Updated 4 years ago
- Learn OpenMP examples step by step☆95Updated 4 months ago
- A library to benchmark CUDA code, similar to google benchmark.☆28Updated 4 years ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆52Updated 2 months ago
- SYCL Benchmark Suite☆64Updated 3 months ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.☆32Updated 2 months ago
- Kernel Tuning Toolkit☆59Updated 3 weeks ago
- Benchmark for measuring the performance of sparse and irregular memory access.☆78Updated last month
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth☆105Updated 7 years ago
- Generate simple index ranges in C++ and CUDA C++☆39Updated last year
- Algorithms implemented in CUDA + resources about GPGPU☆56Updated 3 years ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial.☆93Updated 3 years ago
- MagmaDNN: a simple deep learning framework in C++☆49Updated 4 years ago
- The ultimate memory bandwidth benchmark☆50Updated 4 months ago
- MiniAMR Adaptive Mesh Refinement (AMR) Mini-App☆34Updated 6 months ago
- CSR-based SpGEMM on nVidia and AMD GPUs☆46Updated 9 years ago
- Serial and parallel implementations of matrix multiplication☆41Updated 4 years ago
- A task benchmark☆42Updated 10 months ago
- Multiple 1-stencil implementations using NVIDIA CUDA.☆13Updated 7 years ago
- Resources for the introductory GPU programming course of the HPC-AI specialized master's program (mastère spécialisé HPC-AI)☆22Updated last year
- ☆29Updated 5 years ago
- Parallel Tasking Library (PTL) - Lightweight C++11 multithreading tasking system featuring thread-pool, task-groups, and lock-free task q…☆47Updated 6 months ago
- TLB Benchmarks☆34Updated 7 years ago
- Learn OpenCL step by step.☆135Updated 2 years ago
- My notes on various HPC papers.☆22Updated 2 years ago
- BLAS implementation for Intel FPGA☆78Updated 4 years ago
- This package includes the implementation for four sparse linear algebra kernels: Sparse-Matrix-Vector-Multiplication (SpMV), Sparse-Trian…☆26Updated 5 years ago
- Code for paper "Engineering a High-Performance GPU B-Tree" accepted to PPoPP 2019☆55Updated 2 years ago
- Autonomic Performance Environment for eXascale (APEX)☆48Updated 2 weeks ago