CisMine/Guide-NVIDIA-Tools


CisMine / Guide-NVIDIA-Tools

NVIDIA tools guide

☆168

Alternatives and similar repositories for Guide-NVIDIA-Tools

Users who are interested in Guide-NVIDIA-Tools are comparing it to the repositories listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

CisMine / Setup_dataset
View on GitHub
Read custom dataset
☆12 • Mar 31, 2023 • Updated 3 years ago
CisMine / Parallel-Computing-Cuda-C
View on GitHub
CUDA Learning guide
☆566 • Jun 20, 2024 • Updated 2 years ago
CisMine / Setup-as-Cuda-programmers
View on GitHub
Setup Cuda
☆29 • May 23, 2024 • Updated 2 years ago
CisMine / GPU-in-ML-DL
View on GitHub
Apply GPU in ML and DL
☆68 • Mar 23, 2026 • Updated 3 months ago
rpgolshan / CUDA-image-processing
View on GitHub
☆15 • Feb 13, 2018 • Updated 8 years ago
ConsciousML / img-processing-cuda
View on GitHub
Implementation from scratch in CUDA C++ of image processing algorithms.
☆24 • Oct 26, 2020 • Updated 5 years ago
mailrocketsystems / CudaSetupUbuntu20
View on GitHub
☆14 • Apr 10, 2023 • Updated 3 years ago
pietrobongini / CUDA-ImageConvolution
View on GitHub
Implementations of 2D Image Convolution algorithm with CUDA (using global memory, shared memory and constant memory)
☆17 • Jan 21, 2018 • Updated 8 years ago
alexarmbr / matmul-playground
View on GitHub
☆29 • Apr 7, 2025 • Updated last year
ThoenigAdrian / NeuralNetworksCudaTutorial
View on GitHub
Implement Neural Networks in Cuda from Scratch
☆23 • May 17, 2024 • Updated 2 years ago
eth-cscs / Tiled-MM
View on GitHub
Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.
☆33 • Apr 2, 2025 • Updated last year
Snektron / gpumode-amd-fp8-mm
View on GitHub
My submission for the GPUMODE/AMD fp8 mm challenge
☆29 • Jun 4, 2025 • Updated last year
salykova / sgemm.cu
View on GitHub
High-Performance FP32 GEMM on CUDA devices
☆126 • Jan 21, 2025 • Updated last year
ademeure / cuda-side-boost
View on GitHub
☆60 • Feb 24, 2026 • Updated 4 months ago
CUDA-Tutorial / CodeSamples
View on GitHub
Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming"
☆99 • Aug 14, 2023 • Updated 2 years ago
KhosroBahrami / ImageFiltering_CUDA
View on GitHub
Image Filtering using CUDA
☆30 • Mar 22, 2019 • Updated 7 years ago
openhackathons-org / HPC_Profiler
View on GitHub
Profiling with NVIDIA Nsight Tools Bootcamp
☆24 • Feb 4, 2026 • Updated 5 months ago
oddity-ai / async-cuda
View on GitHub
Asynchronous CUDA for Rust.
☆40 • Jun 30, 2026 • Updated 3 weeks ago
Ratbuyer / h100-features
View on GitHub
☆18 • Mar 12, 2025 • Updated last year
siboehm / SGEMM_CUDA
View on GitHub
Fast CUDA matrix multiplication from scratch
☆1,256 • Sep 2, 2025 • Updated 10 months ago
ademeure / QuickRunCUDA
View on GitHub
☆20 • May 30, 2026 • Updated last month
leimao / CUTLASS-Examples
View on GitHub
CUTLASS and CuTe Examples
☆136 • Nov 30, 2025 • Updated 7 months ago
luongthecong123 / fp8-quant-matmul
View on GitHub
Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.
☆19 • Feb 9, 2026 • Updated 5 months ago
SzymonOzog / GPU_Programming
View on GitHub
☆98 • May 30, 2026 • Updated last month
lionlai1989 / GPU_Programming_Specialization
View on GitHub
My study notes and hands-on projects for CUDA-based GPU programming
☆13 • Dec 11, 2025 • Updated 7 months ago
yifuwang / symm-mem-recipes
View on GitHub
☆170 • Dec 27, 2024 • Updated last year
gpu-mode / resource-stream
View on GitHub
GPU programming related news and material links
☆2,233 • Jun 15, 2026 • Updated last month
iankur / vqllm
View on GitHub
Residual vector quantization for KV cache compression in large language model
☆12 • Oct 22, 2024 • Updated last year
IST-DASLab / gemm-fp8
View on GitHub
High Performance FP8 GEMM Kernels for SM89 and later GPUs.
☆21 • Jan 24, 2025 • Updated last year
vdesai2014 / inference-optimization-blog-post
View on GitHub
☆92 • Feb 29, 2024 • Updated 2 years ago
atharvaaalok / deepfusion
View on GitHub
A highly modular and customizable Deep Learning framework.
☆25 • May 23, 2024 • Updated 2 years ago
gpu-mode / kernelbot
View on GitHub
Write a fast kernel and see how you compare against the best humans and AI on gpumode.com
☆103 • Updated this week
dropbox / gemlite
View on GitHub
Fast low-bit matmul kernels in Triton
☆477 • Updated this week
rapidsai / deployment
View on GitHub
RAPIDS Deployment Documentation
☆15 • Updated this week
pranjalssh / fast.cu
View on GitHub
Fastest kernels written from scratch
☆583 • Sep 18, 2025 • Updated 10 months ago
oyanghd / gpu-compress-decompress
View on GitHub
Graphics card often idling? Is the decompression speed of common tools too slow? This project is a GPU + multi-process, multi-thread comp…
☆11 • Dec 4, 2023 • Updated 2 years ago
NTT123 / cute-viz
View on GitHub
Cute layout visualization
☆43 • Jan 18, 2026 • Updated 6 months ago
mcrl / tccl
View on GitHub
Thunder Research Group's Collective Communication Library
☆53 • Jul 8, 2025 • Updated last year
hpdps-group / ICS23-GPULZ
View on GitHub
GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs
☆16 • Apr 18, 2025 • Updated last year