codeplaysoftware / cutlass-sycl
A CUTLASS implementation using SYCL
☆22 Updated this week
Alternatives and similar repositories for cutlass-sycl
Users who are interested in cutlass-sycl compare it to the libraries listed below.
- ☆60 Updated 5 months ago
- An extension library of WMMA API (Tensor Core API) ☆96 Updated 10 months ago
- ☆20 Updated 3 weeks ago
- ☆143 Updated this week
- Dissecting NVIDIA GPU Architecture ☆94 Updated 2 years ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆83 Updated this week
- ☆96 Updated last year
- ☆44 Updated 4 years ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆134 Updated this week
- ☆24 Updated last week
- GPU Performance Advisor ☆65 Updated 2 years ago
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆38 Updated 9 months ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA ☆32 Updated 4 years ago
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆42 Updated 2 months ago
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators ☆92 Updated last month
- A hierarchical collective communications library with portable optimizations ☆35 Updated 5 months ago
- ☆45 Updated this week
- ☆51 Updated 5 years ago
- ☆20 Updated 2 years ago
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆61 Updated 2 months ago
- Assembler for NVIDIA Volta and Turing GPUs ☆218 Updated 3 years ago
- ☆12 Updated last week
- Advanced Profiling and Analytics for AMD Hardware ☆154 Updated this week
- rocWMMA ☆111 Updated this week
- ☆34 Updated this week
- Ahead of Time (AOT) Triton Math Library ☆63 Updated this week
- OpenAI Triton backend for Intel® GPUs ☆185 Updated this week
- ☆18 Updated 5 years ago
- ROC profiler library. Profiling with perf-counters and derived metrics. ☆145 Updated this week
- ☆50 Updated last year