suco-gt / HPC-Internships
Supercomputing @ GT has compiled a list of organizations that offer internships and experiences in HPC and applications of HPC.
☆64 Updated last year
Alternatives and similar repositories for HPC-Internships:
Users who are interested in HPC-Internships are comparing it to the repositories listed below:
- Rodinia benchmark ☆16 Updated 8 months ago
- grmonty: relativistic Monte Carlo code ☆40 Updated 4 months ago
- CUDA Matrix Multiplication Optimization ☆173 Updated 8 months ago
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆51 Updated last month
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆247 Updated last week
- ☆233 Updated this week
- N-Ways to Multi-GPU Programming ☆18 Updated last year
- Some CUDA projects and utility ☆30 Updated 5 years ago
- ☆46 Updated last year
- collection of benchmarks to measure basic GPU capabilities ☆309 Updated last month
- An out-of-tree MLIR dialect template. ☆100 Updated 6 months ago
- NVIDIA tools guide ☆118 Updated 2 months ago
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆204 Updated 3 months ago
- Example Makefile for CUDA and C++ source files in a standard project layout. ☆48 Updated 7 years ago
- IMPACT GPU Algorithms Teaching Labs ☆57 Updated last year
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems ☆130 Updated 4 years ago
- ☆142 Updated 7 months ago
- tutorials about polyhedral compilation. ☆31 Updated last month
- This repository collects the materials from the course "Foundations of HPC" at Data Science and Scientific Computer, University of Triest… ☆14 Updated 4 years ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core) ☆127 Updated 4 years ago
- Solution of Programming Massively Parallel Processors ☆42 Updated last year
- A Parallel Code Evaluation Benchmark ☆25 Updated this week
- Step-by-step optimization of CUDA SGEMM ☆294 Updated 2 years ago
- Stepwise optimizations of DGEMM on CPU, reaching performance faster than Intel MKL eventually, even under multithreading. ☆136 Updated 3 years ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆64 Updated this week
- Fast Matrix Multiplication Implementation in C programming language. This matrix multiplication algorithm is similar to what Numpy uses t… ☆31 Updated 3 years ago
- A website covering major HPC technologies, designed to welcome contributions. ☆70 Updated last year
- Personal Notes for Learning HPC & Parallel Computation [Active Adding New Content] ☆61 Updated 2 years ago
- Advanced Matrix Extensions (AMX) Guide ☆83 Updated 3 years ago
- A curated list of awesome high performance computing resources ☆813 Updated last week