ROCm / hipCaffe
(Deprecated) hipCaffe: the HIP port of Caffe
☆124 Updated 4 months ago
Related projects:
- ☆114 Updated this week
- ☆116 Updated this week
- MIOpenGEMM is now deprecated ☆61 Updated last year
- The repo is obsolete. Use at your own risk. ☆12 Updated 6 years ago
- CL Offline Compiler: Compile OpenCL kernels to HSAIL ☆49 Updated 7 years ago
- Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL ☆135 Updated 7 years ago
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆167 Updated last year
- HCC is an Open Source, Optimizing C++ Compiler for Heterogeneous Compute currently for the ROCm GPU Computing Platform ☆428 Updated 4 years ago
- Easy to run kernels using OpenCL ☆183 Updated 6 years ago
- OpenCL support for TensorFlow via SYCL ☆65 Updated 6 years ago
- ☆11 Updated this week
- An OpenCL backend for torch. ☆289 Updated 7 years ago
- Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Juli… ☆28 Updated 4 years ago
- OpenCL Torch ☆147 Updated 5 years ago
- A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory ☆287 Updated 5 years ago
- Library for fast image convolution in neural networks on Intel Architecture ☆28 Updated 7 years ago
- High Performance Linpack for GPUs (Using OpenCL, CUDA, CAL) ☆87 Updated 8 years ago
- Optimized half precision gemm assembly kernels (deprecated due to ROCm) ☆47 Updated 7 years ago
- A thin wrapper around MIOpen and cuDNN ☆37 Updated last year
- Next generation BLAS implementation for ROCm platform ☆341 Updated this week
- An OpenCL-based software library containing random number generation functions ☆132 Updated 2 years ago
- Open single and half precision gemm implementations ☆364 Updated last year
- portDNN is a library implementing neural network algorithms written using SYCL ☆106 Updated 3 months ago
- High Efficiency Convolution Kernel for Maxwell GPU Architecture ☆134 Updated 7 years ago
- This repository contains the results and code for the MLPerf™ Training v0.5 benchmark. ☆35 Updated last year
- A cuDNN minimal deep learning training code sample using LeNet. ☆257 Updated last year
- Collection of samples and utilities for using ComputeCpp, Codeplay's SYCL implementation ☆322 Updated last year
- A software library containing Sparse functions written in OpenCL ☆173 Updated 4 years ago
- HIP back-end for Thrust that has been replaced by rocThrust ☆28 Updated last year
- Stretching GPU performance for GEMMs and tensor contractions. ☆212 Updated this week