Tutorial to optimize GEMM performance on Android
☆51Feb 17, 2016Updated 10 years ago
Alternatives and similar repositories for gemm-android
Users that are interested in gemm-android are comparing it to the libraries listed below.
- profiling gemm on android☆10Apr 1, 2016Updated 10 years ago
- Low-precision matrix multiplication☆1,841Jan 29, 2024Updated 2 years ago
- Demos interesting image-in, image-out networks running on both NVIDIA and AMD GPUs, with NNVM☆49Nov 21, 2017Updated 8 years ago
- ☆10Sep 10, 2025Updated 7 months ago
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters☆21Mar 2, 2016Updated 10 years ago
- GPU Automatically Tuned Linear Algebra Software☆28Sep 1, 2015Updated 10 years ago
- A faster re-implementation of the FAST-9 algorithm (C++, with C bindings available)☆14Feb 1, 2017Updated 9 years ago
- The benchmark of ncnn that is a high-performance neural network inference framework optimized for the mobile platform☆72Mar 8, 2019Updated 7 years ago
- CK-NNTest: collaboratively validating, benchmarking and optimizing neural net operators across platforms, frameworks and datasets☆15Jul 10, 2021Updated 4 years ago
- Kernel Fusion and Runtime Compilation Based on NNVM☆72Nov 21, 2016Updated 9 years ago
- Porting caffe to android platform☆10Jul 16, 2016Updated 9 years ago
- Open single and half precision gemm implementations☆397Apr 2, 2023Updated 3 years ago
- Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL☆137Apr 20, 2017Updated 9 years ago
- Tuned OpenCL BLAS☆1,173Apr 13, 2026Updated 2 weeks ago
- Face detection with alignment from unconstrained photos☆12Sep 29, 2015Updated 10 years ago
- A program that times various techniques for performing a moving median filter (sometimes called rolling median, or streaming median)☆11Feb 13, 2016Updated 10 years ago
- Train neural networks to automate your home☆19Mar 1, 2023Updated 3 years ago
- This is a read-only mirror of the gem5 simulator. The upstream repository is stored in https://gem5.googlesource.com, code reviews shoul…☆19Aug 21, 2021Updated 4 years ago
- A light-weight deep convolutional neural network for face detection☆13Mar 8, 2019Updated 7 years ago
- ☆16Nov 21, 2017Updated 8 years ago
- CLTune: An automatic OpenCL & CUDA kernel tuner☆185Dec 12, 2022Updated 3 years ago
- Open Source Library for GPU-Accelerated Execution of Trained Deep Convolutional Neural Networks on Android☆543Apr 12, 2017Updated 9 years ago
- Acceleration package for neural networks on multi-core CPUs☆1,704Jun 11, 2024Updated last year
- Deep CNN on Android☆30Feb 26, 2017Updated 9 years ago
- ☆18Oct 31, 2025Updated 6 months ago
- Amalgamation and go binding☆63Nov 11, 2015Updated 10 years ago
- Proof-of-Concept CNN in Halide☆22Aug 4, 2016Updated 9 years ago
- ☆12May 17, 2019Updated 6 years ago
- Portable 128-bit SIMD intrinsics☆59Jul 4, 2023Updated 2 years ago
- a software library containing BLAS functions written in OpenCL☆864Aug 2, 2024Updated last year
- Companion source code for GTC 2014 talk☆11Mar 25, 2014Updated 12 years ago
- NNVM for ROCm Examples☆19Nov 22, 2017Updated 8 years ago
- Personal collection of references for high performance mixed precision training.☆41Oct 21, 2019Updated 6 years ago
- The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologi…☆3,139Apr 23, 2026Updated last week
- An Example of MXNet Model Compilation and Deployment with NNVM in C++☆16Apr 25, 2018Updated 8 years ago
- Neural Style Transfer with Caffe2 on your Android phone☆82Mar 28, 2019Updated 7 years ago
- Cross platform (Visual Studio,Xcode,clang,gcc...) testsuite for OpenCL. Based on CMake and LLVM's lit test framework.☆18Dec 10, 2017Updated 8 years ago
- A powerful Laravel storage driver that enables seamless synchronization of files across multiple disks, with an integrated cache disk for…☆15Nov 11, 2025Updated 5 months ago
- Wraps the ffmpeg API in C++ and registers it in QML. You can use the QML type VideoItem much like QtMultimedia.☆14Nov 16, 2015Updated 10 years ago