CoffeeBeforeArch / spring_2020_tutorial
"Hardware, Software, and Compilers! Oh My!" tutorial files
☆17 Updated 5 years ago
Alternatives and similar repositories for spring_2020_tutorial:
Users who are interested in spring_2020_tutorial are comparing it to the repositories listed below.
- Examples for using SYCL on CUDA ☆60 Updated 3 weeks ago
- Serial and parallel implementations of matrix multiplication ☆39 Updated 3 years ago
- Algorithms implemented in CUDA + resources about GPGPU ☆53 Updated 3 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated last year
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆104 Updated 7 years ago
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming" ☆88 Updated last year
- ☆23 Updated 2 years ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆148 Updated last year
- ☆42 Updated 4 years ago
- Learn OpenCL step by step. ☆131 Updated 2 years ago
- MagmaDNN: a simple deep learning framework in C++ ☆49 Updated 4 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆30 Updated last month
- A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆49 Updated last year
- My notes on various HPC papers. ☆21 Updated 2 years ago
- Simple starter code for SYCL and Eigen ☆18 Updated 7 years ago
- ☆65 Updated 10 years ago
- SYCL Benchmark Suite ☆60 Updated 4 months ago
- Tools to create performance and roofline plots from measured data ☆58 Updated 10 years ago
- ☆22 Updated 2 years ago
- Implementation of breadth first search on GPU with CUDA Driver API. ☆47 Updated 3 years ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 Updated 3 years ago
- RAJA Performance Suite ☆118 Updated last week
- BGHT: High-performance static GPU hash tables. ☆57 Updated 4 months ago
- ☆56 Updated 3 weeks ago
- Simple OpenCL Samples that Build with Khronos Headers and Libs ☆96 Updated this week
- TAU Performance System Public Mirror (Updated every night at midnight, USA Pacific Time) ☆39 Updated this week
- Advanced Profiling and Analytics for AMD Hardware ☆139 Updated this week
- SYCL Conformance Tests ☆65 Updated last week
- Concurrent CPU-GPU Programming using Task Models ☆100 Updated 5 years ago
- Slides from the "Bits of Architecture" series on YouTube ☆21 Updated 2 years ago