jetpacapp / pi-gemm

A Raspberry Pi GPU-accelerated implementation of the GEMM matrix-multiply function

☆88

Alternatives and similar repositories for pi-gemm

Users who are interested in pi-gemm are comparing it to the libraries listed below.

jetpacapp / qpu-asm
An assembler/disassembler for the QPU processors on the Raspberry Pi
☆120 Updated 9 years ago
naibaf7 / libdnn
Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL
☆136 Updated 8 years ago
Idein / qmkl
Math Kernel Library for VideoCore IV QPU
☆69 Updated 7 years ago
chihchun / opencl-docker
Docker images that support different OpenCL runtimes
☆33 Updated 8 years ago
moskewcz / boda
Boda: A C++ Framework for Efficient Experiments in Computer Vision
☆64 Updated 5 years ago
eBay / maxDNN
High Efficiency Convolution Kernel for Maxwell GPU Architecture
☆134 Updated 8 years ago
ROCm / hipCaffe
(Deprecated) hipCaffe: the HIP port of Caffe
☆124 Updated last year
codekansas / tinier-nn
Binarized Neural Network TF training code + C matrix / eval library.
☆101 Updated 7 years ago
gplhegde / caffepresso
CaffePresso: An Optimized Library for Deep Learning on Embedded Accelerator-based platforms
☆87 Updated 9 months ago
spcl / ucudnn
Accelerating DNN Convolutional Layers with Micro-batches
☆63 Updated 5 years ago
hannes-brt / cudnn-python-wrappers
Python wrappers for the NVIDIA cuDNN libraries
☆140 Updated 8 years ago
doe300 / VC4C
Compiler for the VC4CL OpenCL implementation
☆118 Updated 2 years ago
intel / clDNN
Compute Library for Deep Neural Networks (clDNN)
☆574 Updated 2 years ago
strin / gemm-android
Tutorial to optimize GEMM performance on Android
☆51 Updated 9 years ago
ville-k / vinn
ViNN - an OpenCL accelerated neural networks library
☆33 Updated 9 years ago
okdshin / instant
DNN Inference with CPU, C++, ONNX support: Instant
☆56 Updated 6 years ago
jetsonhacks / installTensorFlowTX1
Scripts to install TensorFlow on the NVIDIA Jetson TX1 Development Kit
☆62 Updated 7 years ago
GPUOpen-ProfessionalCompute-Libraries / amdovx-core
AMD OpenVX Core -- a sub-module of amdovx-modules
☆148 Updated 6 years ago
hughperkins / EasyCL
Easy to run kernels using OpenCL
☆185 Updated 3 months ago
milakov / nnForge
Convolutional neural networks C++ framework with CPU and GPU (CUDA) backends
☆181 Updated 6 years ago
NVIDIA / pynvrtc
Python Binding to NVRTC
☆79 Updated 9 months ago
ctuning / ck-tensorrt
Collective Knowledge repository for NVIDIA's TensorRT
☆37 Updated 4 years ago
NervanaSystems / ngraph-python
Original Python version of Intel® Nervana™ Graph
☆215 Updated 2 years ago
benjibc / caffe-rpi
Caffe: a fast open framework for deep learning.
☆43 Updated 9 years ago
jolibrain / dd_performances
DeepDetect performance sheet
☆93 Updated 5 years ago
KhronosGroup / NNEF-Tools
The NNEF Tools repository contains tools to generate and consume NNEF documents
☆227 Updated last week
hughperkins / distro-cl
OpenCL Torch
☆146 Updated 6 years ago
intel / ideep
Intel® Optimization for Chainer*, a Chainer module providing numpy like API and DNN acceleration using MKL-DNN.
☆173 Updated last week
benanne / theano_fftconv
Convolution op for Theano based on CuFFT using scikits.cuda
☆52 Updated 11 years ago
arrayfire / arrayfire-ml
ArrayFire's Machine Learning Library.
☆105 Updated 6 years ago