mperlet / matrix_multiplication
Parallel Matrix Multiplication Using OpenMP, Pthreads, and MPI
☆56Updated 2 years ago
Alternatives and similar repositories for matrix_multiplication:
Users who are interested in matrix_multiplication are comparing it to the repositories listed below
- "Hardware, Software, and Compilers! Oh My!" tutorial files☆16Updated 5 years ago
- MPI Tutorial Exercises☆45Updated 11 years ago
- matrix multiplication in CUDA☆124Updated last year
- Introduction to CUDA programming☆116Updated 7 years ago
- IMPACT GPU Algorithms Teaching Labs☆57Updated 2 years ago
- Learn OpenMP examples step by step☆91Updated 3 months ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA☆32Updated 4 years ago
- Implementation of a simple CNN using CUDA☆68Updated 7 years ago
- ☆34Updated 5 years ago
- ☆43Updated 4 years ago
- Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)☆60Updated last month
- openmp examples☆143Updated 6 years ago
- Sample code from the book "Professional CUDA C Programming"☆35Updated last year
- Serial and parallel implementations of matrix multiplication☆40Updated 4 years ago
- ☆18Updated 5 years ago
- OpenMP tutorial☆38Updated 7 years ago
- SpMV using CUDA☆18Updated 7 years ago
- Benchmark implementation of CosmoFlow in TensorFlow Keras☆21Updated last year
- Examples for HPC course☆39Updated 4 years ago
- CUDA for MNIST training/inference☆40Updated last year
- Deep Learning framework in C++/CUDA that supports symbolic/automatic differentiation, dynamic computation graphs, tensor/matrix operation…☆53Updated 3 years ago
- ☆23Updated 5 years ago
- Programming accelerated applications with CUDA C/C++, enough to be able to begin work accelerating your own CPU-only applications for per…☆94Updated 6 years ago
- This is an archive of materials produced for an introductory class on CUDA programming at Stanford University in 2010☆218Updated 2 years ago
- A demo of the fast Fourier transform in CUDA, implemented with the Cooley-Tukey and Stockham methods☆8Updated 7 years ago
- CSR-based SpGEMM on NVIDIA and AMD GPUs☆45Updated 9 years ago
- CUDA by practice☆125Updated 5 years ago
- ☆27Updated 2 years ago
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU…☆35Updated 9 years ago
- CSC Summer School in High-Performance Computing☆107Updated this week