muatik / openmp-examples
openmp examples
☆129Updated 5 years ago
Related projects:
- Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)☆55Updated 6 months ago
- The CMake version of cuda_by_example☆141Updated 4 years ago
- OpenMP tutorial☆36Updated 7 years ago
- Source Code for 'Pro TBB: C++ Parallel Programming with Threading Building Blocks' by Michael Voss, Rafael Asenjo, and James Reinders☆168Updated last month
- Implementation of breadth first search on GPU with CUDA Driver API.☆46Updated 3 years ago
- Learn OpenMP examples step by step☆81Updated 3 years ago
- pdf☆85Updated 6 years ago
- Stepwise optimizations of DGEMM on CPU, reaching performance faster than Intel MKL eventually, even under multithreading.☆103Updated 2 years ago
- ☆100Updated 5 months ago
- supplementary material/programming exercises☆71Updated 2 years ago
- ☆382Updated 9 years ago
- ☆189Updated this week
- CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. …☆350Updated last year
- Personal Notes for Learning HPC & Parallel Computation [Active Adding New Content]☆56Updated 2 years ago
- ☆63Updated 10 years ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.☆81Updated 6 months ago
- Google Colab Notebooks for Udacity CS344 - Intro to Parallel Programming☆126Updated 3 years ago
- A tutorial for CUDA & PyTorch☆110Updated last week
- Intel Data Parallel C++ (and SYCL 2020) Tutorial.☆90Updated 2 years ago
- Code for paper "Engineering a High-Performance GPU B-Tree" accepted to PPoPP 2019☆51Updated 2 years ago
- Matrix Multiplication on GPU using Shared Memory considering Coalescing and Bank Conflicts☆24Updated 2 years ago
- Solutions for "Programming Massively Parallel Processors: A Hands-on Approach", 2nd edition☆26Updated 2 years ago
- A cross-platform CUDA/C++17 starter project with google test and google benchmark support.☆35Updated last year
- A library of various helper routines and frameworks used by many of the lab's software☆39Updated 4 months ago
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems☆126Updated 4 years ago
- Exercises and Solutions for "Programming Your GPU with OpenMP: A Hands-On Introduction"☆119Updated 10 months ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)☆109Updated 4 years ago
- ☆241Updated 3 years ago
- SYCL 2020 course by 小彭老师 (under construction; to be released later via livestream)☆15Updated last year
- Implementation of parallel Breadth First Algorithm for graph traversal using CUDA and C++ language.☆30Updated 4 years ago