scalable-analyses / sme
☆24Updated last month
Alternatives and similar repositories for sme
Users who are interested in sme are comparing it to the libraries listed below
- CPU micro benchmarks☆56Updated 3 weeks ago
- ☆14Updated last year
- A GPU FP32 computation method with Tensor Cores.☆20Updated 2 years ago
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.☆25Updated 7 months ago
- Microarchitecture diagrams of several CPUs☆35Updated 3 weeks ago
- A framework that supports executing unmodified CUDA source code on non-NVIDIA devices.☆127Updated 4 months ago
- ☆11Updated 2 years ago
- ☆44Updated 4 years ago
- A translator from NVPTX to SPIR-V, modified from the LLVM-SPIR-V Translator.☆39Updated 3 years ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs☆30Updated 7 months ago
- SYCL Reference Manual☆27Updated last year
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.☆32Updated last month
- Code samples related to Intel(R) AMX☆37Updated last year
- The University of Bristol HPC Simulation Engine☆96Updated this week
- ☆54Updated 5 years ago
- Updated C version of the Test Suite for Vectorising Compilers☆59Updated last year
- ☆51Updated 5 years ago
- ☆33Updated 3 years ago
- ☆41Updated this week
- BEER determines an ECC code's parity-check matrix based on the uncorrectable errors it can cause. BEER targets Hamming codes that are use…☆19Updated 4 years ago
- Bridging polyhedral analysis tools to the MLIR framework☆110Updated last year
- PTX-EMU is a simple emulator for CUDA programs.☆31Updated 3 weeks ago
- ☆30Updated 2 years ago
- A kernel extension that allows pinning a thread to a specific CPU core on Apple Silicon machines☆17Updated 5 months ago
- ☆97Updated last week
- Handwritten GEMM using Intel AMX (Advanced Matrix Extension)☆14Updated 4 months ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.☆134Updated this week
- Triton to TVM transpiler.☆19Updated 7 months ago
- GPU Performance Advisor☆65Updated 2 years ago
- CuPBoP-AMD is a CUDA translator that translates CUDA programs at NVVM IR level to HIP-compatible IR that can run on AMD GPUs.☆36Updated last year