kashif / cuda-workshop
Code examples for the CUDA workshop
☆35 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for cuda-workshop
- Python Framework for sparse neural networks ☆19 · Updated 7 years ago
- FluidNet re-written with ATen tensor lib ☆51 · Updated 5 years ago
- A fork of Eigen 3.2 to use MAGMA (GPU & CPU) as backend in the same way it does with Intel MKL. ☆48 · Updated 10 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms ☆12 · Updated 7 years ago
- Utilities for CUDA programming ☆39 · Updated 5 years ago
- This repository contains easy-to-read Python/CUDA implementations of fundamental GPU computing primitives. ☆36 · Updated 9 years ago
- A GPU / CPU implementation of a feed forward neural network ☆33 · Updated 9 years ago
- CNNs in Halide ☆23 · Updated 9 years ago
- Fork of magma to include more BLAS ☆28 · Updated 7 years ago
- Resources to work offline on the assignments of Heterogeneous Parallel Programming course from Coursera. ☆71 · Updated 5 years ago
- ☆42 · Updated 6 years ago
- CVPR 2017 notes ☆24 · Updated 6 years ago
- Workshop on the future of gradient-based machine learning software, NIPS 2017, 2016 ☆15 · Updated 6 years ago
- Hierarchical Image Representation ☆10 · Updated 11 months ago
- Example code used in the CVPR 2015 tutorial ☆39 · Updated 9 years ago
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 · Updated 7 years ago
- Generating 3D CAD models with manifolds trained by Gaussian Process Latent Variable Models ☆20 · Updated 8 years ago
- FastHOG library that has been fixed to work with CUDA 5.x on Ubuntu 12.04 ☆19 · Updated 10 years ago
- PLEASE SEE THE OFFICIAL REPOSITORY. THIS IS NOT MAINTAINED ANYMORE. ☆93 · Updated 4 years ago
- Python Binding to NVRTC ☆79 · Updated last month
- C++ library for numerical arrays and tensor objects and operations with them, designed to allow Matlab-style programming. ☆51 · Updated last year
- A Cython interface to FLANN ☆25 · Updated 3 years ago
- A shallow fork of SuiteSparse adding build files for Visual Studio and support for ACML ☆100 · Updated 9 years ago
- CUDA Random Forest implementation for Image Labeling tasks ☆180 · Updated 4 years ago
- Automatic Differentiation for OpenCL. ☆21 · Updated 9 years ago
- a fully-differentiable graphical raytracer ☆15 · Updated 9 years ago
- Simple example of implementing a new Tensorflow operation and its gradient in C++. ☆56 · Updated 5 years ago
- Communication-Minimizing 2D Convolution in GPU Registers ☆30 · Updated 11 years ago
- Non-Negative Least Squares implementation for Eigen3 ☆37 · Updated last year