wykvictor / cs344-cuda-udacity

Windows Visual Studio solutions for the class "Introduction to Parallel Programming"

☆19

Alternatives and similar repositories for cs344-cuda-udacity

Users who are interested in cs344-cuda-udacity are comparing it to the repositories listed below.

DrZhang99 / algorithms-cuda
Parallel algorithms based on CUDA
☆60 Updated 7 years ago
yanqswhu / cuda_by_example
The CMake version of cuda_by_example
☆149 Updated 5 years ago
zhxfl / CUDA-CNN
CNN accelerated by CUDA. Tested on MNIST, finally reaching 99.76%
☆186 Updated 7 years ago
nickspell / udacity-IntroToParallelProgramming
CS344 - Introduction To Parallel Programming course (Udacity) proposed solutions
☆53 Updated 8 years ago
hyln9 / GCNGEMM
Optimized half-precision GEMM assembly kernels (deprecated due to ROCm)
☆47 Updated 8 years ago
yuxianzhi / Top-K
A way to use CUDA to accelerate the top-k algorithm
☆29 Updated 8 years ago
merrymercy / tvm-mali
Optimizing Mobile Deep Learning on ARM GPU with TVM
☆181 Updated 6 years ago
PerfXLab / embedded_ai
☆209 Updated 7 years ago
mz24cn / clnet
OpenCL for Nets - A deep learning framework based on OpenCL, written in C++. Supports popular MLP, RNN (LSTM), CNN (ResNet). Friendly debug…
☆68 Updated 6 years ago
TLESORT / YOLO-TensorRT-GIE-
This code is an implementation of a trained YOLO neural network used with the TensorRT framework.
☆88 Updated 8 years ago
OrangeOwlSolutions / General-CUDA-programming
☆44 Updated 7 years ago
xuqiantong / CUDA-Winograd
Fast CUDA Kernels for ResNet Inference.
☆177 Updated 6 years ago
ctuning / ck-tensorrt
Collective Knowledge repository for NVIDIA's TensorRT
☆37 Updated 4 years ago
CSshengxy / MEC
ICML 2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)
☆17 Updated 6 years ago
vinx13 / tvm-cuda-int8-benchmark
Benchmark of TVM quantized model on CUDA
☆111 Updated 5 years ago
dorthyluu / cs194-winograd
☆26 Updated 8 years ago
lukeyeager / cmake-cuda-example
Example of how to use CUDA with CMake >= 3.8
☆70 Updated last month
IntelLabs / SkimCaffe
Caffe for Sparse Convolutional Neural Network
☆238 Updated 2 years ago
CharlieCurry / tvm-learning
TVM learning and research
☆13 Updated 4 years ago
Orion34-lanbo / tvm-batch-matmul-example
☆24 Updated 7 years ago
ysh329 / OpenCL-101
Learn OpenCL step by step.
☆138 Updated 2 years ago
ravi-teja-mullapudi / Halide-NN
CNNs in Halide
☆23 Updated 9 years ago
carlushuang / cpu_gemm_opt
How to design a CPU GEMM on x86 with avx256 that can beat OpenBLAS.
☆71 Updated 6 years ago
csehydrogen / Winograd-OpenCL
Winograd-based convolution implementation in OpenCL
☆28 Updated 8 years ago
XiuYuLi / flexible-gemm
flexible-gemm conv of deepcore
☆17 Updated 5 years ago
xingyul / sparse-winograd-cnn
Efficient Sparse-Winograd Convolutional Neural Networks (ICLR 2018)
☆191 Updated 6 years ago
ibebrett / CUDA-CS344
My solutions to Udacity's Parallel Programming course (CS 344)
☆75 Updated 8 years ago
OAID / MXNet-HRT
Heterogeneous Run Time version of MXNet. Adds heterogeneous capabilities to MXNet; uses heterogeneous computing infrastructure frame…
☆72 Updated 7 years ago
cwlacewe / netscope
This is a CNN Analyzer tool, based on Netscope by dgschwend/netscope
☆42 Updated 7 years ago
ArchaeaSoftware / cudahandbook
Source code that accompanies The CUDA Handbook.
☆532 Updated 6 months ago