kashif / cuda-workshop
Code examples for the CUDA workshop
☆36 Updated 3 years ago
Alternatives and similar repositories for cuda-workshop
Users who are interested in cuda-workshop are comparing it to the libraries listed below.
- Introduction to CUDA programming ☆129 Updated 8 years ago
- This is a cross-platform, CUDA-based C++ library for general-purpose, unconstrained nonlinear optimization on the GPU. It implements the … ☆138 Updated 5 years ago
- Utilities for CUDA programming ☆42 Updated 6 years ago
- This repository contains easy-to-read Python/CUDA implementations of fundamental GPU computing primitives. ☆36 Updated 10 years ago
- Resources to work offline on the assignments of the Heterogeneous Parallel Programming course from Coursera. ☆72 Updated 6 years ago
- ☆43 Updated 8 years ago
- PLEASE SEE THE OFFICIAL REPOSITORY. THIS IS NOT MAINTAINED ANYMORE. ☆93 Updated 6 years ago
- Example of how to use CUDA with CMake >= 3.8 ☆70 Updated 7 months ago
- A CUDA implementation of the k-means clustering algorithm ☆255 Updated 13 years ago
- Python Binding to NVRTC ☆79 Updated last year
- Python Framework for sparse neural networks ☆19 Updated 8 years ago
- FastHOG library that has been fixed to work with CUDA 5.x on Ubuntu 12.04 ☆20 Updated 12 years ago
- kmeans clustering with multi-GPU capabilities ☆122 Updated 2 years ago
- a heterogeneous multiGPU level-3 BLAS library ☆46 Updated 6 years ago
- GPU implementation of classical molecular dynamics proxy application. ☆31 Updated 9 years ago
- kmeans ☆54 Updated 9 years ago
- Fork of magma to include more BLAS ☆28 Updated 9 years ago
- This example builds on the parallel-forall repo separate compilation example by adding CMake to it. ☆17 Updated 8 years ago
- Deep neural network framework for multiple GPUs ☆34 Updated 10 years ago
- Optimized half precision gemm assembly kernels (deprecated due to ROCm) ☆47 Updated 8 years ago
- FluidNet re-written with ATen tensor lib ☆52 Updated 6 years ago
- My solutions to Udacity's Parallel Programming course (CS 344) ☆76 Updated 8 years ago
- Simple example of implementing a new Tensorflow operation and its gradient in C++. ☆56 Updated 6 years ago
- Example code used in the CVPR 2015 tutorial ☆42 Updated 10 years ago
- Multi-core CPU implementation of deep learning for 2D and 3D sliding window convolutional networks (ConvNets). ☆94 Updated 9 years ago
- Source code that accompanies The CUDA Handbook. ☆566 Updated 4 months ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆158 Updated 2 years ago
- Parallel network flows using OpenMP and CUDA. ☆28 Updated 7 years ago
- CNNs in Halide ☆23 Updated 10 years ago
- A tutorial series for learning OpenCL ☆288 Updated 10 years ago