benoitsteiner / tensorflow-xsmm
Improved performance for TensorFlow on Intel hardware.
☆14 Updated 6 years ago
Alternatives and similar repositories for tensorflow-xsmm
Users who are interested in tensorflow-xsmm are comparing it to the libraries listed below.
- ☆10 Updated 2 years ago
- Library for fast image convolution in neural networks on Intel Architecture ☆29 Updated 7 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms ☆12 Updated 7 years ago
- Caffe: a fast open framework for deep learning. With OpenCL and CUDA support. ☆86 Updated 6 years ago
- Intel(R) Machine Learning Scaling Library is a library providing an efficient implementation of communication patterns used in deep learn… ☆109 Updated 2 years ago
- Scientific library for high-precision computations and research ☆49 Updated 7 years ago
- Input-aware cuBLAS/clBLAS implementation for better performance ☆17 Updated 2 years ago
- Boda: A C++ Framework for Efficient Experiments in Computer Vision ☆64 Updated 5 years ago
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 Updated 7 years ago
- The Operator Vectorization Library, or OVL, is a python productivity library for defining high performance custom operators for the Tenso… ☆68 Updated 8 years ago
- Proof-of-Concept CNN in Halide ☆22 Updated 8 years ago
- Compiler toolkit for neuFlow. ☆26 Updated 11 years ago
- NNVM for ROCm Examples ☆19 Updated 7 years ago
- a heterogeneous multiGPU level-3 BLAS library ☆45 Updated 5 years ago
- Sources for OpenCL and CUDA tutorials. http://jlaning.com ☆20 Updated 9 years ago
- Tools to convert Caffe models to neon's serialization format ☆39 Updated 2 years ago
- This repository contains the results and code for the MLPerf™ Training v0.5 benchmark. ☆35 Updated 3 weeks ago
- Deep neural network framework (C/C++/CUDA). ☆31 Updated 9 years ago
- Code examples for CUDA and OpenACC ☆34 Updated 9 months ago
- Examples of building probabilistic models with MXNet linear algebra operators ☆23 Updated 7 years ago
- CNNs in Halide ☆23 Updated 9 years ago
- XLA integration of Open Neural Network Exchange (ONNX) ☆19 Updated 6 years ago
- Optimized half precision gemm assembly kernels (deprecated due to ROCm) ☆47 Updated 7 years ago
- Deep neural network framework for multiple GPUs ☆33 Updated 9 years ago
- ☆16 Updated 7 years ago
- MXNet Model Serving ☆25 Updated 7 years ago
- Catamount is a compute graph analysis tool to load, construct, and modify deep learning models and to symbolically analyze their compute … ☆13 Updated 4 years ago
- GPU Optimization and Memory Abstraction Framework ☆32 Updated 5 years ago
- Convolutional neural networks C++ framework with CPU and GPU (CUDA) backends ☆178 Updated 6 years ago
- An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluations ☆17 Updated 5 years ago