ntuhpc / training-ay1819
sample code/text used in NTU HPC Internal Training during AY2018-2019
☆24 · Updated 6 years ago
Alternatives and similar repositories for training-ay1819
Users who are interested in training-ay1819 are comparing it to the repositories listed below.
- CUDA C++ syntax support & snippets for VSCode ☆20 · Updated 4 years ago
- Seminar on selected tools in Computer Science ☆25 · Updated 4 years ago
- My notes on various HPC papers. ☆22 · Updated 2 years ago
- A hybrid partitioner based quantum circuit simulation system on GPU ☆47 · Updated 2 years ago
- ☆29 · Updated 5 years ago
- Parallel Algorithm Scheduling Library ☆106 · Updated 7 years ago
- An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3 ☆27 · Updated 4 years ago
- CMU 15210 Parallel and Sequential Data Structures and Algorithms ☆21 · Updated 9 years ago
- Introduction to CUDA programming ☆118 · Updated 8 years ago
- A sparse BLAS lib supporting multiple backends ☆43 · Updated 3 months ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆32 · Updated 2 months ago
- ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs ☆83 · Updated last week
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆108 · Updated 2 years ago
- IMPACT GPU Algorithms Teaching Labs ☆57 · Updated 2 years ago
- Some source code about matrix multiplication implementation on CUDA ☆34 · Updated 6 years ago
- SJTU HPC open-source project: Spackenv (Spack ENVironment), which switches environments between sysadmins, users, and developers. ☆22 · Updated 3 years ago
- Implementation of breadth first search on GPU with CUDA Driver API. ☆50 · Updated 4 years ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA ☆32 · Updated 4 years ago
- ROCm SPARSE marshalling library ☆67 · Updated this week
- Collections of all codes, past year paper solutions, cheat sheets, and summaries for Computer Science at Nanyang Technological University ☆43 · Updated 9 years ago
- NUMA-aware multi-CPU multi-GPU data transfer benchmarks ☆23 · Updated last year
- HPCG benchmark based on ROCm platform ☆37 · Updated this week
- Some example MPI programs ☆96 · Updated 13 years ago
- This repository stores all of the OLCF vector addition tutorials ☆25 · Updated 11 years ago
- This is a repo which contains some details about how to use OpenCL backend (Xilinx/Intel). ☆25 · Updated 5 years ago
- cuASR: CUDA Algebra for Semirings ☆35 · Updated 2 years ago
- A task benchmark ☆42 · Updated 10 months ago
- This repository contains supplementary source code for the OpenMP(R) API Specification. ☆120 · Updated last week
- ☆17 · Updated 3 years ago
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆120 · Updated this week