negativo17 / cuda
NVIDIA Compute Unified Device Architecture Toolkit
☆14 Updated last month
Related projects
Alternatives and complementary repositories for cuda
- ☆14 Updated 5 years ago
- Compiler toolkit for neuFlow. ☆26 Updated 11 years ago
- DSL for stencils and image processing ☆13 Updated 8 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆54 Updated 7 years ago
- stage the upgrade of hcc-clang to clang ToT ☆11 Updated 4 years ago
- Python bindings for libNVVM ☆37 Updated 10 years ago
- Input-aware cuBLAS/clBLAS implementation for better performance ☆17 Updated 2 years ago
- An ONNX backend using PlaidML ☆28 Updated 6 years ago
- OpenCL compilation with clang compiler. ☆26 Updated 5 months ago
- ☆9 Updated 5 years ago
- C for Media Runtime ☆23 Updated 2 years ago
- Torch is a scientific computing framework with wide support for machine learning algorithms. It is easy to use and efficient, thanks to a… ☆38 Updated 2 years ago
- OpenCL tool to detect buffer overflows in GPU kernels ☆20 Updated 5 years ago
- Any code related to AMDGPUs ☆8 Updated 6 years ago
- Ninja-based configuration system ☆11 Updated 4 years ago
- Enable Polyhedral JIT compilation ☆9 Updated 6 years ago
- Catamount is a compute graph analysis tool to load, construct, and modify deep learning models and to symbolically analyze their compute … ☆12 Updated 3 years ago
- ViNN - an OpenCL accelerated neural networks library ☆33 Updated 8 years ago
- Custom fork containing our own python backend for integration into neon ☆15 Updated last year
- MIOpenGEMM is now deprecated ☆61 Updated last year
- nGraph™ Backend for ONNX ☆42 Updated last year
- GPU Automatically Tuned Linear Algebra Software ☆28 Updated 9 years ago
- BLAS OpenCL implementation. ☆15 Updated 9 years ago
- MCMC for the Dark Energy Spectroscopic Instrument ☆13 Updated 8 years ago
- Fast stand-alone C++ decoder for RNN-based NMT models ☆25 Updated 3 years ago
- Benchmark supporting baseless libel against clang-format ☆11 Updated 4 years ago
- C++ to OpenCL C Source-to-source Translation ☆13 Updated 10 years ago
- Communication-Minimizing 2D Convolution in GPU Registers ☆30 Updated 11 years ago
- convert the deep-residual-network(50, 101, 152) from caffe to mxnet ☆10 Updated 8 years ago
- The "CUDA templates" are a collection of C++ template classes and functions which provide a consistent interface to NVIDIA's "Compute Uni… ☆27 Updated 13 years ago