CRobeck / instrument-amdgpu-kernels
LLVM/MLIR based compiler instrumentation of AMD GPU kernels
☆20Updated 3 months ago
Alternatives and similar repositories for instrument-amdgpu-kernels
Users who are interested in instrument-amdgpu-kernels are comparing it to the libraries listed below.
- ☆157Updated this week
- TPP experimentation on MLIR for linear algebra☆137Updated 3 weeks ago
- ☆54Updated 5 years ago
- development repository for the open earth compiler☆80Updated 4 years ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.☆145Updated last week
- Dissecting NVIDIA GPU Architecture☆109Updated 3 years ago
- ☆64Updated 6 years ago
- ☆38Updated 3 years ago
- tutorials about polyhedral compilation.☆55Updated last week
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.☆122Updated this week
- ☆109Updated last year
- GPU Performance Advisor☆65Updated 3 years ago
- MLIR Sample dialect☆131Updated 8 months ago
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators☆118Updated 5 months ago
- Performance Prediction Toolkit for GPUs☆37Updated 3 years ago
- ☆47Updated 4 years ago
- Assembler for NVIDIA Volta and Turing GPUs☆231Updated 3 years ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU☆82Updated 6 years ago
- Triton to TVM transpiler.☆22Updated last year
- An extension library of WMMA API (Tensor Core API)☆107Updated last year
- A framework that supports executing unmodified CUDA source code on non-NVIDIA devices.☆136Updated 9 months ago
- ☆287Updated last month
- Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git)☆131Updated 3 years ago
- An MLIR-based toy DL compiler for TVM Relay.☆59Updated 3 years ago
- MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com☆38Updated last year
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA☆35Updated 5 years ago
- Benchmark Framework for Buddy Projects☆55Updated last month
- assembler for NVIDIA FERMI. Imported from Google Code☆73Updated 10 years ago
- Conversions to MLIR EmitC☆133Updated 10 months ago
- MLIR-based toolkit targeting intel heterogeneous hardware☆48Updated 8 months ago