artyom-beilis / dlprimitives
Deep Learning Primitives and Mini-Framework for OpenCL
☆169 Updated last week
Related projects:
- DLPrimitives/OpenCL out of tree backend for pytorch ☆272 Updated 2 weeks ago
- HIPIFY: Convert CUDA to Portable C++ Code ☆499 Updated this week
- Next generation BLAS implementation for ROCm platform ☆341 Updated this week
- Development repository for the Triton language and compiler ☆86 Updated this week
- AMD's graph optimization engine. ☆183 Updated this week
- chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs. ☆185 Updated this week
- ☆221 Updated this week
- An implementation of BLAS using the SYCL open standard. ☆250 Updated 2 weeks ago
- A collection of examples for the ROCm software stack ☆149 Updated this week
- ☆57 Updated last year
- Implementation of OpenCL 3.0 on Vulkan ☆342 Updated 2 weeks ago
- A tool which profiles OpenCL devices to find their peak capacities ☆399 Updated 3 months ago
- GPUOcelot: A dynamic compilation framework for PTX ☆136 Updated 3 months ago
- Stretching GPU performance for GEMMs and tensor contractions. ☆213 Updated this week
- ROCm Platform Runtime: ROCr a HPC market enhanced HSA based runtime ☆213 Updated this week
- Nod.ai 🦈 version of 👻 . You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆107 Updated this week
- Print all known information about all available OpenCL platforms and devices in the system ☆311 Updated 2 months ago
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona… ☆48 Updated this week
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆293 Updated this week
- A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP) ☆353 Updated last month
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆250 Updated this week
- Tuned OpenCL BLAS ☆1,046 Updated 3 months ago
- Fork of LLVM to support AMD AIEngine processors ☆99 Updated this week
- ROCm Device Libraries ☆99 Updated 4 months ago
- ☆80 Updated 3 months ago
- OpenCL/SPIR-V implementation of HIP ☆104 Updated last year
- ☆89 Updated this week
- ROCm Parallel Primitives ☆156 Updated this week
- An implementation of HIP that works on CPUs, across OSes. ☆109 Updated 6 months ago
- ☆291 Updated this week