gcielniak / OpenCL-Tutorials
OpenCL Tutorials
☆46Updated 4 years ago
Related projects:
- Learn OpenCL step by step.☆127Updated 2 years ago
- ☆63Updated 10 years ago
- Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)☆55Updated 6 months ago
- Answers for Programming Massively Parallel Processors: A Hands-on Approach, second edition☆26Updated 2 years ago
- Chinese translation of the CUDA PTX-ISA document☆23Updated 6 months ago
- Collection of easy, well-documented and useful OpenCL examples in C++.☆63Updated 2 years ago
- AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/Ope…☆53Updated this week
- Examples for using SYCL on CUDA☆59Updated 2 weeks ago
- An extension library of WMMA API (Tensor Core API)☆81Updated 2 months ago
- A Chinese translation of the English edition of "Heterogeneous Computing with OpenCL 2.0".☆124Updated 3 years ago
- The repository targets the OpenCL gemm function performance optimization. It compares several libraries clBLAS, clBLAST, MIOpenGemm, Inte…☆14Updated 5 years ago
- Simple OpenCL Samples that Build with Khronos Headers and Libs☆84Updated last week
- clone of https://code.google.com/p/opencl-book-samples (there's an official repo here https://github.com/bgaster/opencl-book-samples)☆43Updated 11 years ago
- ☆65Updated 5 months ago
- Algorithms implemented in CUDA + resources about GPGPU☆53Updated 2 years ago
- Tencent NCNN with added CUDA support☆67Updated 3 years ago
- CUDA for MNIST training/inference☆37Updated 8 months ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)☆109Updated 4 years ago
- Training material for Nsight developer tools☆125Updated last month
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.☆81Updated 6 months ago
- portDNN is a library implementing neural network algorithms written using SYCL☆106Updated 3 months ago
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming"☆78Updated last year
- ☆18Updated 3 years ago
- Optimize GEMM with tensorcore step by step☆11Updated 9 months ago
- pdf☆85Updated 6 years ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆39Updated 8 months ago
- ☆100Updated 5 months ago
- ☆54Updated last year
- Intel Data Parallel C++ (and SYCL 2020) Tutorial.☆90Updated 2 years ago
- The CMake version of cuda_by_example☆141Updated 4 years ago