CoffeeBeforeArch / spring_2020_tutorial
"Hardware, Software, and Compilers! Oh My!" tutorial files
☆16 Updated 5 years ago
Alternatives and similar repositories for spring_2020_tutorial
Users who are interested in spring_2020_tutorial are comparing it to the repositories listed below.
- Examples for using SYCL on CUDA ☆62 Updated 2 weeks ago
- ☆45 Updated 4 years ago
- SYCL Benchmark Suite ☆65 Updated 3 weeks ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆106 Updated 7 years ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 Updated 3 years ago
- Examples for HIP ☆209 Updated 7 months ago
- ☆35 Updated last year
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆55 Updated 3 months ago
- C++ files from the "C++ Crash Course" YouTube series by CoffeeBeforeArch ☆103 Updated 3 years ago
- Serial and parallel implementations of matrix multiplication ☆42 Updated 4 years ago
- Learn OpenMP examples step by step ☆95 Updated 5 months ago
- openmp examples ☆143 Updated 6 years ago
- Intel® GPU Compute Samples ☆108 Updated last month
- ROC profiler library. Profiling with perf-counters and derived metrics. ☆150 Updated last week
- RAJA Performance Suite ☆118 Updated this week
- SYCL Conformance Tests ☆70 Updated last week
- CUDA kernel author's tools ☆111 Updated 3 years ago
- SYCL Open Source Specification ☆136 Updated this week
- portDNN is a library implementing neural network algorithms written using SYCL ☆113 Updated last year
- Advanced Profiling and Analytics for AMD Hardware ☆159 Updated this week
- Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben A… ☆275 Updated 3 months ago
- Simple OpenCL Samples that Build with Khronos Headers and Libs ☆108 Updated this week
- Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal, and Rust - all it takes to sum a lot of numbers fast! ☆99 Updated last month
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- Little OpenMP Library ☆163 Updated 2 years ago
- Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.) ☆60 Updated 3 months ago
- ☆67 Updated 11 years ago
- A unified framework across multiple programming platforms ☆41 Updated last month
- mallocMC: Memory Allocator for Many Core Architectures ☆58 Updated 2 months ago
- CUDA by practice ☆129 Updated 5 years ago