gmarciani / cudawesome

A collection of awesome algorithms, implemented in CUDA.

☆25

Alternatives and similar repositories for cudawesome

Users who are interested in cudawesome are comparing it to the repositories listed below.

rbaygildin / learn-gpgpu
Algorithms implemented in CUDA + resources about GPGPU
☆56 Updated 3 years ago
CUDA-Tutorial / CodeSamples
Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming"
☆91 Updated last year
jslee02 / awesome-gpgpu
A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources
☆99 Updated 2 years ago
ndd314 / cuda_examples
☆68 Updated 11 years ago
ProjectPhysX / PTXprofiler
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
☆55 Updated 4 months ago
Ahdhn / CUDATemplate
Template for starting CUDA/C++ project using CMake with Github Action for CI
☆31 Updated last month
Erkaman / Awesome-CUDA
This is a list of useful libraries and resources for CUDA development.
☆578 Updated 7 years ago
tpn / cuda-samples
☆61 Updated 2 years ago
Apress / data-parallel-CPP
Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben A…
☆275 Updated 4 months ago
owensgroup / BGHT
BGHT: High-performance static GPU hash tables.
☆70 Updated last month
mikeroyal / CUDA-Guide
CUDA Guide
☆72 Updated last year
CUDACommunity / CUDACommunityMeetup2021
☆23 Updated 3 years ago
ROCm / HIP-Examples
Examples for HIP
☆210 Updated 8 months ago
PatWie / cuda-design-patterns
Some CUDA design patterns and a bit of template magic for CUDA
☆156 Updated 2 years ago
robertmaynard / code-samples
Source code examples from the Parallel Forall Blog
☆96 Updated 6 years ago
ashvardanian / ParallelReductionsBenchmark
Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal, and Rust - all it takes to sum a lot of numbers fast!
☆103 Updated 2 weeks ago
codeplaysoftware / SYCL-For-CUDA-Examples
Examples for using SYCL on CUDA
☆62 Updated last month
ecrc / kblas-gpu
Subset of BLAS routines optimized for NVIDIA GPUs
☆71 Updated 2 years ago
gunrock / loops
🎃 GPU load-balancing library for regular and irregular computations.
☆62 Updated last year
mattdean1 / cuda
An implementation of parallel exclusive scan in CUDA
☆62 Updated 7 years ago
NVIDIA / nsight-training
Training material for Nsight developer tools
☆163 Updated last year
amd / amd-lab-notes
AMD lab notes with code examples to demonstrate use of AMD GPUs
☆100 Updated last year
horizon-research / rtnn
☆67 Updated 2 years ago
enginBozkurt / CUDA-Programming
GPU Parallel Computing software solution examples with CUDA
☆14 Updated 7 years ago
codeplaysoftware / portDNN
portDNN is a library implementing neural network algorithms written using SYCL
☆113 Updated last year
eegkno / CUDA_by_practice
CUDA by practice
☆129 Updated 5 years ago
puttsk / cuda-tutorial
A set of hands-on tutorials for CUDA programming
☆230 Updated last year
KernelTuner / kernel_tuner
Kernel Tuner
☆356 Updated 2 weeks ago
ThoenigAdrian / NeuralNetworksCudaTutorial
Implement Neural Networks in Cuda from Scratch
☆23 Updated last year
ysh329 / OpenMP-101
Learn OpenMP examples step by step
☆95 Updated 6 months ago