suco-gt / HPC-Internships
Supercomputing @ GT has compiled a list of organizations that offer internships and experiences in HPC and applications of HPC.
☆64 · Updated last year
Alternatives and similar repositories for HPC-Internships:
Users interested in HPC-Internships are comparing it to the repositories listed below.
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆253 · Updated 3 weeks ago
- Example Makefile for CUDA and C++ source files in a standard project layout. ☆48 · Updated 7 years ago
- Rodinia benchmark ☆16 · Updated 9 months ago
- N-Ways to Multi-GPU Programming ☆21 · Updated 2 years ago
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems ☆130 · Updated 4 years ago
- Source code of the SC '23 paper: "DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multipli… ☆26 · Updated 10 months ago
- Step-by-step optimization of CUDA SGEMM ☆308 · Updated 3 years ago
- Solution of Programming Massively Parallel Processors ☆43 · Updated last year
- 📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software ☆31 · Updated last month
- ☆18 · Updated 5 years ago
- NVIDIA tools guide ☆125 · Updated 3 months ago
- Source code of the PPoPP '22 paper: "TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs" by Y… ☆39 · Updated 10 months ago
- ☆117 · Updated 3 weeks ago
- A hierarchical collective communications library with portable optimizations ☆33 · Updated 4 months ago
- ☆29 · Updated 9 months ago
- collection of benchmarks to measure basic GPU capabilities ☆354 · Updated 2 months ago
- A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores ☆51 · Updated last year
- Performance Prediction Toolkit for GPUs ☆37 · Updated 3 years ago
- performance engineering ☆30 · Updated 9 months ago
- Mirror of http://gitlab.hpcrl.cse.ohio-state.edu/chong/ppopp19_ae, refactoring for understanding ☆14 · Updated 3 years ago
- Implementation and analysis of five different GPU based SPMV algorithms in CUDA ☆39 · Updated 6 years ago
- Learn OpenMP examples step by step ☆91 · Updated 3 months ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆76 · Updated this week
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆51 · Updated last month
- ☆149 · Updated 8 months ago
- An out-of-tree MLIR dialect template. ☆101 · Updated 7 months ago
- ☆12 · Updated last month
- CUDA Matrix Multiplication Optimization ☆179 · Updated 8 months ago
- Personal Notes for Learning HPC & Parallel Computation [Active Adding New Content] ☆63 · Updated 2 years ago
- COCCL: Compression and precision co-aware collective communication library ☆22 · Updated last month