artyom-beilis / pytorch_dlprim
DLPrimitives/OpenCL out-of-tree backend for PyTorch
☆281 Updated 2 months ago
Related projects
Alternatives and complementary repositories for pytorch_dlprim
- Deep Learning Primitives and Mini-Framework for OpenCL ☆174 Updated 2 months ago
- chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs. ☆226 Updated this week
- HIPIFY: Convert CUDA to Portable C++ Code ☆523 Updated this week
- OpenAI Triton backend for Intel® GPUs ☆143 Updated this week
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆309 Updated this week
- Development repository for the Triton language and compiler ☆92 Updated this week
- build scripts for ROCm ☆181 Updated 9 months ago
- AMD's graph optimization engine. ☆185 Updated this week
- ☆58 Updated last year
- A collection of examples for the ROCm software stack ☆166 Updated this week
- Fork of LLVM to support AMD AIEngine processors ☆107 Updated this week
- 8-bit CUDA functions for PyTorch ☆38 Updated this week
- ☆228 Updated this week
- Next generation BLAS implementation for ROCm platform ☆346 Updated this week
- ROCm Platform Runtime: ROCr a HPC market enhanced HSA based runtime ☆223 Updated this week
- ☆98 Updated this week
- An implementation of BLAS using the SYCL open standard. ☆259 Updated last week
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆219 Updated this week
- ROCm BLAS marshalling library ☆118 Updated this week
- A tool which profiles OpenCL devices to find their peak capacities ☆409 Updated this week
- Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi… ☆201 Updated last week
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona… ☆60 Updated this week
- This is the AMD-maintained fork of the LLVM git repository. This repository accepts pull requests and issues related to AMD fork-specific… ☆122 Updated this week
- Stretching GPU performance for GEMMs and tensor contractions. ☆220 Updated this week
- ☆387 Updated 2 months ago
- Kernel Tuner ☆286 Updated this week
- Tuned OpenCL BLAS ☆1,062 Updated this week
- ☆314 Updated this week
- Unified compiler/runtime for interfacing with PyTorch Dynamo. ☆95 Updated this week
- Nod.ai 🦈 version of 👻. You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆107 Updated this week