TravisWThompson1 / Makefile_Example_CUDA_CPP_To_Executable

Example Makefile for CUDA and C++ source files in a standard project layout.

☆47

Alternatives and similar repositories for Makefile_Example_CUDA_CPP_To_Executable

Users who are interested in Makefile_Example_CUDA_CPP_To_Executable are comparing it to the repositories listed below.

FZJ-JSC / tutorial-multi-gpu
Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial
☆280 Updated last month
yzhaiustc / Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F
Stepwise optimizations of DGEMM on CPU, eventually outperforming Intel MKL, even under multithreading.
☆150 Updated 3 years ago
wangzyon / NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
☆355 Updated 3 years ago
olcf / cuda-training-series
Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)
☆816 Updated 10 months ago
poojahira / spmv-cuda
Implementation and analysis of five different GPU based SPMV algorithms in CUDA
☆41 Updated 6 years ago
RRZE-HPC / gpu-benches
Collection of benchmarks to measure basic GPU capabilities
☆393 Updated 5 months ago
NVIDIA / multi-gpu-programming-models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
☆754 Updated 4 months ago
leimao / CUDA-GEMM-Optimization
CUDA Matrix Multiplication Optimization
☆202 Updated 11 months ago
yzhaiustc / Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to close-to-cuBLAS performance.
☆366 Updated 6 months ago
cwpearson / nvidia-performance-tools
Instructions, Docker images, and examples for Nsight Compute and Nsight Systems
☆132 Updated 5 years ago
leimao / CUTLASS-Examples
CUTLASS and CuTe Examples
☆63 Updated this week
wzsh / wmma_tensorcore_sample
Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)
☆138 Updated 4 years ago
NVIDIA / nvbench
CUDA Kernel Benchmarking Library
☆682 Updated last week
deeperlearning / professional-cuda-c-programming
☆448 Updated 10 years ago
Bruce-Lee-LY / cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruct…
☆439 Updated 10 months ago
essentialsofparallelcomputing / EssentialsOfParallelComputing
Main Book repository for the Parallel and High Performance Computing book, Manning Publications
☆209 Updated 3 years ago
RichardAns / CUDA-Programs
Examples from Programming in Parallel with CUDA
☆157 Updated 2 years ago
zjin-lcf / HeCBench
☆248 Updated last month
Cjkkkk / CUDA_gemm
A simple high performance CUDA GEMM implementation.
☆386 Updated last year
NVIDIA / NVTX
The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou…
☆415 Updated this week
pwlnk / cuda-neural-network
Simple neural network implementation using CUDA technology. It is an educational implementation.
☆96 Updated 7 years ago
Yinghan-Li / YHs_Sample
Yinghan's Code Sample
☆339 Updated 2 years ago
NVIDIA / nsight-training
Training material for Nsight developer tools
☆161 Updated 11 months ago
cloudcores / CuAssembler
An unofficial CUDA assembler for all generations of SASS, hopefully :)
☆516 Updated 2 years ago
siboehm / SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
☆764 Updated last year
KernelTuner / kernel_tuner
Kernel Tuner
☆353 Updated this week
CodedK / CUDA-by-Example-source-code-for-the-book-s-examples-
CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. …
☆427 Updated 2 years ago
Liu-xiandong / How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…
☆1,093 Updated last year
XiaoSong9905 / CUDA-Optimization-Guide
Xiao's CUDA Optimization Guide [NO LONGER ADDING NEW CONTENT]
☆305 Updated 2 years ago
ROCm / composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
☆437 Updated this week