suco-gt / HPC-Internships
Supercomputing @ GT has compiled a list of organizations that offer internships and experiences in HPC and applications of HPC.
☆61Updated last year
Alternatives and similar repositories for HPC-Internships:
Users who are interested in HPC-Internships are comparing it to the repositories listed below:
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial☆209Updated 2 months ago
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems☆130Updated 4 years ago
- NVIDIA tools guide☆102Updated last month
- Collection of benchmarks to measure basic GPU capabilities☆296Updated last week
- ☆42Updated 4 years ago
- Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite☆64Updated 6 years ago
- N-Ways to Multi-GPU Programming☆16Updated last year
- ☆48Updated last year
- Benchmark for measuring the performance of sparse and irregular memory access.☆76Updated last week
- Main Book repository for the Parallel and High Performance Computing book, Manning Publications☆192Updated 2 years ago
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm☆198Updated 2 months ago
- CUDA Matrix Multiplication Optimization☆161Updated 7 months ago
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems.☆49Updated 4 months ago
- ☆9Updated last year
- XSBench: The Monte Carlo Macroscopic Cross Section Lookup Benchmark☆76Updated 11 months ago
- Forked from https://bitbucket.org/berkeleylab/cs-roofline-toolkit/src/master/☆19Updated 5 years ago
- Rodinia benchmark☆16Updated 7 months ago
- Implementation and analysis of five different GPU based SPMV algorithms in CUDA☆38Updated 6 years ago
- Stepwise optimizations of DGEMM on CPU, eventually reaching performance faster than Intel MKL, even under multithreading.☆127Updated 3 years ago
- Source code of the SC '23 paper: "DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multipli…☆25Updated 8 months ago
- grmonty: relativistic Monte Carlo code☆37Updated 3 months ago
- A hierarchical collective communications library with portable optimizations☆29Updated 2 months ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)☆124Updated 4 years ago
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches☆69Updated last year
- Some CUDA projects and utilities☆29Updated 5 years ago
- ☆27Updated 5 years ago
- CUDA Flux is a profiler for GPU applications that reports the basic block execution frequencies of compute kernels☆32Updated 3 years ago
- CME 213 Spring 2021☆64Updated 3 years ago
- Source code for matrix multiplication implementations in CUDA☆35Updated 6 years ago
- Samples demonstrating how to use the Compute Sanitizer Tools and Public API☆75Updated last year