amd / ZenDNN
☆126 Updated this week
Alternatives and similar repositories for ZenDNN
Users who are interested in ZenDNN are comparing it to the libraries listed below.
- AMD's graph optimization engine. ☆258 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆253 Updated last week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆358 Updated this week
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆478 Updated this week
- OpenAI Triton backend for Intel® GPUs ☆212 Updated this week
- oneCCL Bindings for Pytorch* ☆102 Updated 2 months ago
- Development repository for the Triton language and compiler ☆137 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆147 Updated this week
- oneAPI Collective Communications Library (oneCCL) ☆245 Updated this week
- Bandwidth test for ROCm ☆67 Updated this week
- AI Tensor Engine for ROCm ☆287 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆115 Updated this week
- Ahead of Time (AOT) Triton Math Library ☆80 Updated last week
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆134 Updated 2 years ago
- ROCm Communication Collectives Library (RCCL) ☆389 Updated last week
- rocWMMA ☆135 Updated last week
- ☆271 Updated last week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆388 Updated this week
- ☆51 Updated this week
- ☆63 Updated 10 months ago
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆63 Updated 3 months ago
- Unified compiler/runtime for interfacing with PyTorch Dynamo. ☆102 Updated 2 months ago
- ☆156 Updated this week
- GPUOcelot: A dynamic compilation framework for PTX ☆210 Updated 8 months ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆84 Updated 2 weeks ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆265 Updated last week
- ☆422 Updated this week
- HIPIFY: Convert CUDA to Portable C++ Code ☆625 Updated this week
- A collection of examples for the ROCm software stack ☆248 Updated last week
- MLIR-based partitioning system ☆139 Updated this week