intel / cutlass-sycl

SYCL-based CUTLASS implementation for Intel GPUs

☆39

Alternatives and similar repositories for cutlass-sycl

Users who are interested in cutlass-sycl are comparing it to the libraries listed below.

intel / xetla
☆62 Updated 9 months ago
ROCm / rocSHMEM
rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
☆117 Updated this week
intel / intel-xpu-backend-for-triton
OpenAI Triton backend for Intel® GPUs
☆208 Updated this week
ROCm / rocprofiler-sdk
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆31 Updated last week
sunlex0717 / DissectingTensorCores
☆108 Updated last year
intel / torch-xpu-ops
☆56 Updated this week
wmmae / wmma_extension
An extension library of WMMA API (Tensor Core API)
☆105 Updated last year
ROCm / rocMLIR
☆150 Updated this week
daadaada / turingas
Assembler for NVIDIA Volta and Turing GPUs
☆231 Updated 3 years ago
ROCm / amd_matrix_instruction_calculator
A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators
☆115 Updated 4 months ago
HabanaAI / Habana_Custom_Kernel
Provides the examples to write and build Habana custom kernels using the HabanaTools
☆22 Updated 5 months ago
ROCm / rocmProfileData
☆27 Updated last week
ROCm / Tensile
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆246 Updated this week
ROCm / rocprofiler-compute
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆162 Updated last week
intel / llvm-test-suite
☆20 Updated 2 years ago
ROCm / rocWMMA
rocWMMA
☆130 Updated this week
intel / pti-gpu
Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi…
☆237 Updated this week
c3sr / tcu_scope
☆50 Updated 6 years ago
libxsmm / tpp-pytorch-extension
Intel® Tensor Processing Primitives extension for Pytorch*
☆17 Updated this week
intel / mlir-extensions
Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.
☆143 Updated last week
oneapi-src / level-zero-spec
☆19 Updated 5 months ago
sjfeng1999 / gpu-arch-microbenchmark
Dissecting NVIDIA GPU Architecture
☆105 Updated 3 years ago
ROCm / rccl-tests
RCCL Performance Benchmark Tests
☆76 Updated last week
uxlfoundation / oneCCL
oneAPI Collective Communications Library (oneCCL)
☆245 Updated this week
intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…
☆62 Updated 2 months ago
merthidayetoglu / HiCCL
A hierarchical collective communications library with portable optimizations
☆36 Updated 9 months ago
ekondis / gpumembench
A GPU benchmark suite for assessing on-chip GPU memory bandwidth
☆106 Updated 8 years ago
uuudown / Tartan
Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite
☆66 Updated 7 years ago
RRZE-HPC / gpu-benches
collection of benchmarks to measure basic GPU capabilities
☆419 Updated 7 months ago
mmperf / mmperf
MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels.
☆134 Updated 2 years ago