ROCm / MIOpen
AMD's Machine Intelligence Library
☆1,095 · Updated this week
Alternatives and similar repositories for MIOpen:
Users who are interested in MIOpen compare it to the libraries listed below.
- Next generation BLAS implementation for ROCm platform ☆355 · Updated this week
- HCC is an Open Source, Optimizing C++ Compiler for Heterogeneous Compute currently for the ROCm GPU Computing Platform ☆432 · Updated 4 years ago
- HIPIFY: Convert CUDA to Portable C++ Code ☆537 · Updated this week
- Dockerfiles for the various software layers defined in the ROCm software platform ☆439 · Updated this week
- HIP: C++ Heterogeneous-Compute Interface for Portability ☆3,834 · Updated this week
- TensorFlow ROCm port ☆689 · Updated this week
- AMD's graph optimization engine. ☆196 · Updated this week
- AMD ROCm™ Software - GitHub Home ☆4,835 · Updated this week
- Stretching GPU performance for GEMMs and tensor contractions. ☆231 · Updated this week
- nGraph has moved to OpenVINO ☆1,350 · Updated 4 years ago
- Tuned OpenCL BLAS ☆1,071 · Updated 2 months ago
- Compute Library for Deep Neural Networks (clDNN) ☆574 · Updated 2 years ago
- ☆625 · Updated this week
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆220 · Updated this week
- (Deprecated) hipCaffe: the HIP port of Caffe ☆124 · Updated 8 months ago
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆440 · Updated 2 months ago
- The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem. ☆1,401 · Updated this week
- Build NVIDIA® CUDA™ code for OpenCL™ 1.2 devices ☆845 · Updated 6 months ago
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆334 · Updated this week
- FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/ ☆1,240 · Updated this week
- oneAPI Deep Neural Network Library (oneDNN) ☆3,677 · Updated this week
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆856 · Updated this week
- The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs ☆1,267 · Updated 9 months ago
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl ☆1,690 · Updated last year
- pocl - Portable Computing Language ☆947 · Updated this week
- ROCm Communication Collectives Library (RCCL) ☆289 · Updated this week
- ROCm Platform Runtime: ROCr a HPC market enhanced HSA based runtime ☆233 · Updated this week
- A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP) ☆372 · Updated this week
- Assembler for NVIDIA Maxwell architecture ☆963 · Updated 2 years ago
- A collection of examples for the ROCm software stack ☆177 · Updated this week