KhronosGroup / NNEF-Registry
Neural Network Exchange Format registry
☆13 Updated 8 months ago
Alternatives and similar repositories for NNEF-Registry
Users who are interested in NNEF-Registry are comparing it to the libraries listed below.
- Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL ☆137 Updated 8 years ago
- The NNEF Tools repository contains tools to generate and consume NNEF documents ☆233 Updated last month
- MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into … ☆208 Updated this week
- High-Performance Reproducible BLAS using posit arithmetic ☆12 Updated 3 years ago
- Library for fast image convolution in neural networks on Intel Architecture ☆30 Updated 8 years ago
- TensorFlow-nGraph bridge ☆136 Updated 4 years ago
- The repo is obsolete. Use at your own risk. ☆12 Updated 7 years ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆114 Updated last year
- Codebase associated with the PyTorch compiler tutorial ☆47 Updated 6 years ago
- Build TVM docker image for production compilation deployments ☆12 Updated 4 years ago
- ONNX Parser is a tool that automatically generates OpenVX inference code (CNN) from ONNX binary model files. ☆18 Updated 7 years ago
- CNNs in Halide ☆23 Updated 10 years ago
- NNVM for ROCm Examples ☆19 Updated 8 years ago
- Cooperative Primitives for CUDA C++ Kernel Authors. This repository contains CUB PRs from Q4 2019 until Q4 2020. ☆22 Updated 5 years ago
- Input-aware cuBLAS/clBLAS implementation for better performance ☆17 Updated 3 years ago
- XLA integration of Open Neural Network Exchange (ONNX) ☆19 Updated 7 years ago
- Optimized half precision gemm assembly kernels (deprecated due to ROCm) ☆47 Updated 8 years ago
- This repository is the summary of all of our works for the XLA. ☆11 Updated 8 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms ☆12 Updated 8 years ago
- nGraph™ Backend for ONNX ☆42 Updated 3 years ago
- Fast matrix multiplication ☆31 Updated 4 years ago
- Accelerating DNN Convolutional Layers with Micro-batches ☆63 Updated 5 years ago
- Tools and extensions for CUDA profiling ☆67 Updated 6 years ago
- ☆26 Updated 3 years ago
- MXNet - nGraph integration ☆34 Updated 4 years ago
- ☆10 Updated 3 years ago
- This is a PyTorch implementation of the Scalpel. Node pruning for five benchmark networks and SIMD-aware weight pruning for LeNet-300-100… ☆41 Updated 7 years ago
- Tutorial to optimize GEMM performance on Android ☆51 Updated 9 years ago
- CUDA FFT convolution ☆16 Updated 10 years ago
- Python Binding to NVRTC ☆79 Updated last year