lcpu-club / awesome-rocm
Collections and tutorials for ROCm
☆20 · Updated 9 months ago
Related projects
Alternatives and complementary repositories for awesome-rocm
- 🔥🔥🔥 A collection of some awesome public CUDA, cuBLAS, TensorRT and High Performance Computing (HPC) projects. ☆157 · Updated last month
- ROC_SHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆39 · Updated last year
- ☆20 · Updated last year
- Advanced Matrix Extensions (AMX) Guide ☆72 · Updated 2 years ago
- performance engineering ☆27 · Updated 4 months ago
- ROCm Communication Collectives Library (RCCL) ☆270 · Updated this week
- An Awesome list of oneAPI projects ☆126 · Updated 3 months ago
- ☆217 · Updated last week
- oneAPI Level Zero Conformance & Performance test content ☆47 · Updated this week
- Forked from https://bitbucket.org/berkeleylab/cs-roofline-toolkit/src/master/ ☆17 · Updated 5 years ago
- Performance Prediction Toolkit for GPUs ☆31 · Updated 2 years ago
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆188 · Updated this week
- ☆59 · Updated this week
- AMD lab notes with code examples to demonstrate use of AMD GPUs ☆91 · Updated 4 months ago
- CUDA GPU Benchmark ☆17 · Updated 4 months ago
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators ☆68 · Updated 10 months ago
- oneAPI Technical Advisory Board (TAB) Meeting Notes ☆72 · Updated 9 months ago
- ☆11 · Updated 2 years ago
- LLM Inference analyzer for different hardware platforms ☆42 · Updated this week
- A collection of examples for the ROCm software stack ☆167 · Updated this week
- Examples for HIP ☆200 · Updated 2 weeks ago
- Source code of the PPoPP '22 paper: "TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs" by Y… ☆38 · Updated 6 months ago
- Solution of Programming Massively Parallel Processors ☆31 · Updated 10 months ago
- ☆71 · Updated this week
- ☆19 · Updated last week
- ☆24 · Updated 7 months ago
- ☆80 · Updated 7 months ago
- Unified Collective Communication Library ☆207 · Updated last week
- GPU Static Modeling using PTX and Deep Structured Learning ☆17 · Updated 4 years ago
- This is the AMD-maintained fork of the LLVM git repository. This repository accepts pull requests and issues related to AMD fork-specific… ☆122 · Updated this week