godweiyang / NN-CUDA-Example

Several simple examples for popular neural network toolkits calling custom CUDA operators.

☆1,488

Alternatives and similar repositories for NN-CUDA-Example

Users who are interested in NN-CUDA-Example are comparing it to the repositories listed below.

Tony-Tan / CUDA_Freshman
☆2,503 Updated last year
BBuf / how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
☆2,345 Updated this week
huiscliu / Tutorials
Parallel programming tutorials
☆628 Updated 4 years ago
brucefan1983 / CUDA-Programming
Sample codes for my CUDA programming book
☆1,765 Updated 5 months ago
LitLeo / TensorRT_Tutorial
☆1,030 Updated last year
NVIDIA / trt-samples-for-hackathon-cn
Simple samples for TensorRT programming
☆1,627 Updated 2 months ago
Liu-xiandong / How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…
☆1,102 Updated 2 years ago
DeepVAC / deepvac
PyTorch Project Specification.
☆680 Updated 3 years ago
mlc-ai / mlc-zh
☆611 Updated last year
OpenPPL / ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
☆1,714 Updated last year
tczhangzhi / pytorch-distributed
A quickstart and benchmark for pytorch distributed training.
☆1,668 Updated last year
HeKun-NVIDIA / CUDA-Programming-Guide-in-Chinese
This is a Chinese translation of the CUDA programming guide
☆1,623 Updated 8 months ago
Oldpan / Pytorch-Memory-Utils
PyTorch memory tracking code
☆1,017 Updated 4 years ago
mli / transformers-benchmarks
real Transformer TeraFLOPS on various GPUs
☆911 Updated last year
PaddleJitLab / CUDATutorial
A self-learning tutorial for CUDA High Performance Programming.
☆686 Updated last month
Tongkaio / CUDA_Kernel_Samples
A guide to hand-writing CUDA operators and preparing for interviews
☆511 Updated 6 months ago
BBuf / tvm_mlir_learn
A collection of compiler learning resources.
☆2,457 Updated 4 months ago
BBuf / how-to-learn-deep-learning-framework
how to learn PyTorch and OneFlow
☆441 Updated last year
lartpang / PyTorchTricks
Some tricks of pytorch...
☆1,195 Updated last year
RussWong / CUDATutorial
A CUDA tutorial for learning CUDA programming from scratch
☆247 Updated last year
zc911 / MatrixSlow
A simple deep learning framework in pure Python for the purpose of learning DL
☆442 Updated 5 months ago
OpenPPL / ppl.nn
A primitive library for neural network
☆1,345 Updated 8 months ago
ifromeast / cuda_learning
learning how CUDA works
☆291 Updated 4 months ago
QINZHAOYU / CudaSteps
A CUDA learning path based on the book 《cuda编程-基础与实践》 (CUDA Programming: Basics and Practice) by 樊哲勇.
☆337 Updated last year
depctg / udacity-cs344-colab
Google Colab Notebooks for Udacity CS344 - Intro to Parallel Programming
☆135 Updated 4 years ago
Jack47 / hack-SysML
The road to hack SysML and become a system expert
☆494 Updated 10 months ago
ModelTC / MQBench
Model Quantization Benchmark
☆826 Updated 3 months ago
borgwang / tinynn
A lightweight deep learning library
☆388 Updated last month
MAhaitao999 / CUDA_Programming
Code for the book 《CUDA编程基础与实践》 (CUDA Programming: Basics and Practice)
☆127 Updated 3 years ago
YouQixiaowu / CUDA-Programming-with-Python
Python versions of the sample code for the book CUDA Programming, using the pycuda module
☆255 Updated 5 years ago