artyom-beilis / dlprimitives
Deep Learning Primitives and Mini-Framework for OpenCL
☆187 · Updated 5 months ago
Alternatives and similar repositories for dlprimitives:
Users who are interested in dlprimitives are comparing it to the libraries listed below:
- DLPrimitives/OpenCL out of tree backend for pytorch ☆317 · Updated 5 months ago
- HIPIFY: Convert CUDA to Portable C++ Code ☆552 · Updated this week
- chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs. ☆251 · Updated this week
- Implementation of OpenCL 3.0 on Vulkan ☆375 · Updated this week
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆263 · Updated last month
- Next generation BLAS implementation for ROCm platform ☆359 · Updated this week
- Tuned OpenCL BLAS ☆1,084 · Updated 3 months ago
- AMD's graph optimization engine. ☆208 · Updated this week
- GPUOcelot: A dynamic compilation framework for PTX ☆166 · Updated last week
- OpenAI Triton backend for Intel® GPUs ☆165 · Updated this week
- OpenCL/SPIR-V implementation of HIP ☆104 · Updated 2 years ago
- A collection of examples for the ROCm software stack ☆186 · Updated this week
- Stretching GPU performance for GEMMs and tensor contractions. ☆233 · Updated this week
- A small OpenCL benchmark program to measure peak GPU/CPU performance. ☆183 · Updated this week
- Python SYCL bindings and SYCL-based Python Array API library ☆109 · Updated this week
- ☆105 · Updated 3 months ago
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆349 · Updated this week
- ☆250 · Updated this week
- A tool which profiles OpenCL devices to find their peak capacities ☆430 · Updated last month
- ROCm Device Libraries ☆97 · Updated 9 months ago
- Development repository for the Triton language and compiler ☆107 · Updated this week
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona… ☆79 · Updated this week
- ☆117 · Updated this week
- A prototype CUDA-to-OpenCL source-to-source translator, built on the Clang compiler framework ☆193 · Updated 4 years ago
- ☆60 · Updated 2 months ago
- The OpenCL ICD Loader project. ☆256 · Updated 3 months ago
- Unified compiler/runtime for interfacing with PyTorch Dynamo. ☆100 · Updated this week
- CMake modules used within the ROCm libraries ☆64 · Updated this week
- rocWMMA ☆100 · Updated this week
- Tensor Tiling Library ☆34 · Updated 5 months ago