cjmcv / hpc
Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)
☆57 Updated last week
Related projects
Alternatives and complementary repositories for hpc
- how to design cpu gemm on x86 with avx256, that can beat openblas. ☆66 Updated 5 years ago
- Code and notes for the 6 major CUDA parallel computing patterns ☆58 Updated 4 years ago
- ☆93 Updated 3 years ago
- a c++/cuda template library for tensor lazy evaluation ☆163 Updated last year
- An unofficial cuda assembler, for all generations of SASS, hopefully :) ☆78 Updated last year
- Learn OpenCL step by step. ☆131 Updated 2 years ago
- study of cutlass ☆19 Updated last week
- The CMake version of cuda_by_example ☆145 Updated 4 years ago
- openmp examples ☆136 Updated 5 years ago
- symmetric int8 gemm ☆66 Updated 4 years ago
- flexible-gemm conv of deepcore ☆17 Updated 4 years ago
- ☆103 Updated 7 months ago
- Solutions for Programming Massively Parallel Processors, 2nd edition