gcielniak / OpenCL-Tutorials
OpenCL Tutorials
☆47Updated 4 years ago
Related projects
Alternatives and complementary repositories for OpenCL-Tutorials
- Solutions to "Programming Massively Parallel Processors: A Hands-on Approach", 2nd edition☆27Updated 2 years ago
- Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)☆57Updated this week
- ☆64Updated 10 years ago
- Tencent NCNN with added CUDA support☆67Updated 3 years ago
- Algorithms implemented in CUDA + resources about GPGPU☆54Updated 2 years ago
- Learn OpenCL step by step.☆132Updated 2 years ago
- ☆18Updated 3 years ago
- Examples for using SYCL on CUDA☆60Updated 2 weeks ago
- Collection of easy, well-documented and useful OpenCL examples in C++.☆66Updated 2 years ago
- The repository targets the OpenCL gemm function performance optimization. It compares several libraries clBLAS, clBLAST, MIOpenGemm, Inte…☆16Updated 5 years ago
- ☆15Updated 10 months ago
- AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/Ope…☆55Updated this week
- Chinese translation of the English edition of "Heterogeneous Computing with OpenCL 2.0".☆128Updated 4 years ago
- clone of https://code.google.com/p/opencl-book-samples (there's an official repo here https://github.com/bgaster/opencl-book-samples)☆44Updated 11 years ago
- An extension library of WMMA API (Tensor Core API)☆84Updated 4 months ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.☆83Updated 9 months ago
- Chinese translation of the CUDA PTX-ISA document☆26Updated 8 months ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆43Updated 10 months ago
- portDNN is a library implementing neural network algorithms written using SYCL☆108Updated 6 months ago
- Swin Transformer C++ Implementation☆54Updated 3 years ago
- Learn OpenMP examples step by step☆87Updated 3 years ago
- ☆103Updated 7 months ago
- Code and notes for six major CUDA parallel computing patterns☆58Updated 4 years ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)☆117Updated 4 years ago
- Matrix Multiplication on GPU using Shared Memory considering Coalescing and Bank Conflicts☆24Updated 2 years ago
- ☆55Updated last year
- Common libraries for PPL projects☆29Updated last month
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming"☆82Updated last year
- Training material for Nsight developer tools☆129Updated 3 months ago
- Optimize GEMM with tensorcore step by step☆15Updated 11 months ago