Stefan20162016 / maxas-explained
Scott Gray's maxas assembler SGEMM: explaining the (for me) missing parts. https://github.com/NervanaSystems/maxas
☆13 Updated 6 years ago
Alternatives and similar repositories for maxas-explained:
Users who are interested in maxas-explained are comparing it to the repositories listed below
- ☆51 Updated 5 years ago
- Kernel Fusion and Runtime Compilation Based on NNVM ☆69 Updated 8 years ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆78 Updated 5 years ago
- Source for Demystifying GPU Microarchitecture through Microbenchmarking ☆16 Updated last year
- Assembler for NVIDIA Fermi. Imported from Google Code ☆71 Updated 9 years ago
- MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com ☆38 Updated last year
- ☆40 Updated 4 years ago
- ☆48 Updated 5 years ago
- Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git) ☆120 Updated 2 years ago
- Flexible GPGPU instrumentation ☆86 Updated 5 years ago
- A framework that helps implementing swizzle GPU kernels ☆41 Updated 4 years ago
- Enabling on-the-fly manipulations with LLVM IR code of CUDA sources ☆108 Updated last year
- Emulating DMA Engines on GPUs for Performance and Portability ☆35 Updated 9 years ago
- Chai ☆42 Updated last year
- Implement asm gemm on vega64 for 4096x4096 fp32 matrix ☆21 Updated 5 years ago
- Haystack is an analytical cache model that, given a program, computes the number of cache misses. ☆44 Updated 5 years ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs ☆27 Updated 4 months ago
- Bridging polyhedral analysis tools to the MLIR framework ☆107 Updated last year
- ☆23 Updated 5 years ago
- Conversions to MLIR EmitC ☆126 Updated last month
- Sample programs for the LLVM PTX back-end ☆35 Updated 9 years ago
- A GPU cache model for research purposes ☆26 Updated 11 years ago
- CUDAAdvisor: a GPU profiling tool ☆48 Updated 6 years ago
- A simple end-to-end example of taking an ML graph (TF2 / PyTorch) and running it on a device [cpu, gpu] ☆29 Updated 3 years ago
- A Benchmark Suite for Heterogeneous System Computation ☆53 Updated 2 months ago
- Performance Prediction Toolkit ☆51 Updated last month
- A Sound and Complete Verification Tool for Warp-Specialized GPU Kernels ☆18 Updated 9 years ago
- HCC Sample Applications ☆13 Updated 8 years ago
- GPU Performance Advisor ☆63 Updated 2 years ago
- Library to plot integer sets and maps ☆48 Updated 8 years ago