digital-nomad-cheng / ECE408_Applied_Parallel_Programming
CUDA solutions for the lab assignments in the UIUC-ECE408 Applied Parallel Programming course.
☆13 Updated last year
Alternatives and similar repositories for ECE408_Applied_Parallel_Programming:
Users who are interested in ECE408_Applied_Parallel_Programming compare it to the repositories listed below.
- Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (T… ☆64 Updated 4 years ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core) ☆124 Updated 4 years ago
- A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores ☆48 Updated last year
- Solution of Programming Massively Parallel Processors ☆40 Updated last year
- Personal Notes for Learning HPC & Parallel Computation [Active Adding New Content] ☆61 Updated 2 years ago
- ☆47 Updated 5 years ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores. ☆85 Updated 2 years ago
- ☆27 Updated 8 months ago
- ☆87 Updated 10 months ago
- Dissecting NVIDIA GPU Architecture ☆88 Updated 2 years ago
- CUDA Matrix Multiplication Optimization ☆161 Updated 7 months ago
- ☆42 Updated 9 months ago
- An extension library of WMMA API (Tensor Core API) ☆88 Updated 7 months ago
- Benchmark Framework for Buddy Projects ☆52 Updated this week
- ☆129 Updated last month
- An Easy-to-understand TensorOp Matmul Tutorial ☆316 Updated 5 months ago
- GPU Performance Advisor ☆64 Updated 2 years ago
- ☆48 Updated last year
- Examples of CUDA implementations by Cutlass CuTe ☆138 Updated 2 weeks ago
- ☆98 Updated last month
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators ☆107 Updated 2 years ago
- GEMM by WMMA (tensor core) ☆10 Updated 2 years ago
- Xiao's CUDA Optimization Guide [Active Adding New Contents] ☆264 Updated 2 years ago
- IMPACT GPU Algorithms Teaching Labs ☆56 Updated last year
- ☆219 Updated last week
- Step-by-step optimization of CUDA SGEMM ☆285 Updated 2 years ago
- Optimize GEMM with tensorcore step by step ☆22 Updated last year
- CUTLASS and CuTe Examples ☆38 Updated last month
- ☆14 Updated last year