tpn/cuda-samples

☆64

Alternatives and similar repositories for cuda-samples

Users who are interested in cuda-samples are comparing it to the repositories listed below.

srkiranraj / spgemm
Sparse matrix-matrix multiplication on CPU+GPU systems.
☆13 · Mar 17, 2014 · Updated 12 years ago
SJTU-IPADS / hackwrench
☆12 · Apr 26, 2023 · Updated 3 years ago
Orion34-lanbo / tvm-batch-matmul-example
☆24 · Mar 22, 2018 · Updated 8 years ago
freeCompilerCamp / play-with-compiler
Play-with-compiler sandbox based on PWD
☆10 · Oct 22, 2020 · Updated 5 years ago
Atlantic777 / mpudp
Multi-path UDP protocol - an example implementation
☆10 · Jul 6, 2015 · Updated 11 years ago
matiaslindgren / cuda-memory-access-recorder
Record GPU memory accesses of a CUDA program and visualize the access pattern in a browser
☆13 · Nov 17, 2020 · Updated 5 years ago
chunhualiao / freeCompilerCamp
Goal: a website to automatically train and certify compiler researchers and developers
☆10 · Nov 24, 2019 · Updated 6 years ago
Xilinx / HPCG_FPGA
☆10 · May 20, 2022 · Updated 4 years ago
freeCompilerCamp / code-for-llvm-tutorials
We try to put source files of LLVM tutorials here
☆18 · Oct 6, 2020 · Updated 5 years ago
marcsous / gpuSparse
MATLAB MEX wrappers to cuSPARSE (NVIDIA)
☆11 · Dec 10, 2025 · Updated 7 months ago
ethz-asl / pcl_catkin
Catkinized version of the latest version of PCL (http://pointclouds.org/)
☆13 · Apr 9, 2020 · Updated 6 years ago
SciML / TensorFlowDiffEq.jl
Using TensorFlow for physics-informed neural networks for scientific machine learning (SciML)
☆16 · Nov 30, 2020 · Updated 5 years ago
alisure-ml / Semantic-Segmentation-DilatedConvolution
Dilated convolution: an implementation of Multi-Scale Context Aggregation by Dilated Convolutions
☆12 · Dec 24, 2017 · Updated 8 years ago
gchaw / wattless
GPU-accelerated AES encryption project
☆11 · Feb 13, 2015 · Updated 11 years ago
AlbertLiDesign / ALG_MarchingCubes_GPU
ALG_MarchingCubes_GPU is an isosurface extraction plug-in for Grasshopper that runs the Marching Cubes algorithm on the GPU.
☆14 · Apr 17, 2020 · Updated 6 years ago
JustKshitijD / Harmonia_for_B_plus_trees
Harmonia is an algorithm that allows for the implementation of operations on B+ trees using parallelization. As a part of my GPU project,…
☆31 · Aug 8, 2021 · Updated 4 years ago
upenn-acg / gpuDranoDynamicAnalysis
An LLVM pass for counting global uncoalesced accesses in CUDA code via dynamic analysis.
☆14 · Nov 17, 2018 · Updated 7 years ago
WaveSpeedAI / QuantumAttention
[WIP] Better (FP8) attention for Hopper
☆33 · Feb 24, 2025 · Updated last year
NVIDIA / otk-shader-util
Vector math and other CUDA helper functions for OptiX kernels
☆10 · Oct 21, 2024 · Updated last year
flame / tblis-strassen
Strassen's Algorithm for Tensor Contraction
☆15 · Jul 7, 2017 · Updated 9 years ago
IbrahimFathy19 / Big-C
Big C++ Book by Cay S. Horstmann, 2nd Edition: solutions to problems and exercises
☆18 · Jul 11, 2018 · Updated 8 years ago
typ0520 / fastdex-test-project
☆11 · Nov 2, 2017 · Updated 8 years ago
wme7 / MultiGPU_AdvectionDiffusion
Multi-GPU (CUDA-MPI) baseline implementation of the heat equation and the inviscid Burgers' equation
☆12 · Oct 17, 2017 · Updated 8 years ago
gmarciani / cudawesome
A collection of awesome algorithms, implemented in CUDA.
☆26 · Feb 6, 2018 · Updated 8 years ago
baidu-research / catamount
Catamount is a compute graph analysis tool to load, construct, and modify deep learning models and to symbolically analyze their compute …
☆14 · May 18, 2021 · Updated 5 years ago
green-anger / MemoryPool
Fast Efficient Fixed-Size Memory Pool
☆15 · Dec 12, 2018 · Updated 7 years ago
sgl-project / DeepGEMM
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
☆32 · Updated this week
UofT-EcoSystem / Tempo
Memory footprint reduction for transformer models
☆11 · Jan 24, 2023 · Updated 3 years ago
haf / AleaGPUTutorial
Tutorial and samples for Alea GPU compiler.
☆17 · Oct 3, 2015 · Updated 10 years ago
RAttab / luger
A direct-to-syslog logger over UDP for Erlang
☆11 · Sep 1, 2020 · Updated 5 years ago
vortexgpgpu / NVPTX-SPIRV-Translator
A translator from NVPTX to SPIR-V, modified from the LLVM-SPIR-V Translator.
☆45 · Oct 25, 2021 · Updated 4 years ago
ishanhan / parallel-implementation-of-kmeans
Parallel implementation of k-means clustering using MPI4PY and PyCUDA.
☆10 · Mar 11, 2019 · Updated 7 years ago
sharan-dce / autograd
Auto-differentiation library for C++
☆12 · Jan 16, 2022 · Updated 4 years ago
dominiquegarmier / grok-pytorch
PyTorch implementation of grok
☆11 · Jul 13, 2026 · Updated last week
hwanhuh / diff-surfel-rasterization-MCMC
A differentiable rasterizer used in the project "2D Gaussian Splatting"
☆14 · Jul 23, 2024 · Updated 2 years ago
MTopOpt / Levelset_AdaptiveMesh
Parallel solver for the level-set topology optimization method with adaptive mesh refinement
☆14 · Nov 13, 2020 · Updated 5 years ago
linnanwang / BLASX
A heterogeneous multi-GPU level-3 BLAS library
☆46 · Dec 9, 2019 · Updated 6 years ago
TomaszRewak / RotatingVoxels
In this project I use C#, Alea GPU and OpenGL.Net to create a simple, hardware-accelerated 3D animation of rotating cubes.
☆12 · Jul 7, 2019 · Updated 7 years ago
flame / fmm-gen
Generating Families of Practical Fast Matrix Multiplication Algorithms
☆12 · Jul 7, 2017 · Updated 9 years ago