google / gpu-runtime
☆16 Updated 5 years ago
Alternatives and similar repositories for gpu-runtime:
Users who are interested in gpu-runtime are comparing it to the repositories listed below
- ☆56 Updated last month
- assembler for NVIDIA FERMI. Imported from Google Code ☆72 Updated 10 years ago
- Tests and benchmarks for cudnn (and in the future, other nvidia libraries) ☆53 Updated 4 years ago
- GPUDirect Async support for IB Verbs ☆110 Updated 2 years ago
- Enabling on-the-fly manipulations with LLVM IR code of CUDA sources ☆110 Updated 2 years ago
- Symbolic Expression and Statement Module for new DSLs ☆205 Updated 4 years ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆81 Updated 5 years ago
- A tool for examining GPU scheduling behavior. ☆81 Updated 8 months ago
- CUPTI GPU Profiler ☆37 Updated 6 years ago
- CERE: Codelet Extractor and REplayer ☆40 Updated last year
- CUDA GDB ☆199 Updated 2 months ago
- Emulating DMA Engines on GPUs for Performance and Portability ☆39 Updated 9 years ago
- Flexible GPGPU instrumentation ☆86 Updated 5 years ago
- Encapsulate the frequently used AVX instructions as independent modules to reduce repeated development workload. ☆120 Updated last year
- ☆241 Updated 2 months ago
- oneAPI Collective Communications Library (oneCCL) ☆232 Updated 2 weeks ago
- This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/ with some local branc… ☆59 Updated 5 months ago
- ☆410 Updated this week
- MLIR-based toolkit targeting intel heterogeneous hardware ☆39 Updated last month
- An Open Source Kepler GPU Assembler ☆20 Updated 8 years ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆134 Updated this week
- ☆141 Updated this week
- ☆13 Updated 2 years ago
- Fine-grained frequency and voltage transition tests ☆20 Updated last year
- Benchmarks for auto-vectorization and revectorization, including both hand-vectorized and scalar code ☆28 Updated 6 years ago
- Automatic virtualization of (general) accelerators. ☆42 Updated 2 years ago
- TPP experimentation on MLIR for linear algebra ☆127 Updated this week
- Intel® Data Mover Library (Intel® DML) ☆93 Updated 3 weeks ago
- ☆34 Updated 3 years ago
- code for benchmarking GPU performance based on cublasSgemm and cublasHgemm ☆31 Updated 2 years ago