Examples from Programming in Parallel with CUDA
☆170 · Feb 5, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for CUDA-Programs
Users who are interested in CUDA-Programs are comparing it to the repositories listed below
- CUDA Learning guide · ☆531 · Jun 20, 2024 · Updated last year
- Read custom dataset · ☆12 · Mar 31, 2023 · Updated 2 years ago
- Automatic differentiation of FEniCS and Firedrake models in Julia · ☆13 · Mar 21, 2021 · Updated 4 years ago
- Some simple MPI programs using mpi4py · ☆14 · Jun 18, 2024 · Updated last year
- Learn CUDA Programming, published by Packt · ☆1,231 · Dec 30, 2023 · Updated 2 years ago
- ☆14 · Feb 13, 2018 · Updated 8 years ago
- An Attention Superoptimizer · ☆22 · Jan 20, 2025 · Updated last year
- Library of CUDA Kernels for Signal Processing · ☆17 · Apr 4, 2022 · Updated 3 years ago
- Examples of Fortran MPI 3.0 / mpi_f08 · ☆18 · Feb 11, 2026 · Updated 2 weeks ago
- Step-by-step optimization of CUDA SGEMM · ☆432 · Mar 30, 2022 · Updated 3 years ago
- The Exascale Computing Project Software Technologies Capability Assessment Report - Public Version · ☆21 · Aug 18, 2022 · Updated 3 years ago
- CUDA training materials · ☆21 · Aug 25, 2024 · Updated last year
- Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruct… · ☆526 · Sep 8, 2024 · Updated last year
- This is a fast GPU SPH framework · ☆21 · Jul 6, 2025 · Updated 7 months ago
- CUDA Core Compute Libraries · ☆2,182 · Updated this week
- Implementations of 2D Image Convolution algorithm with CUDA (using global memory, shared memory and constant memory) · ☆17 · Jan 21, 2018 · Updated 8 years ago
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming" · ☆95 · Aug 14, 2023 · Updated 2 years ago
- Development of SuiteSparse.jl, which ships as part of the Julia standard library. · ☆26 · Nov 6, 2022 · Updated 3 years ago
- zenus parallel computing library for zenus physics-based simulations · ☆96 · Jul 6, 2025 · Updated 7 months ago
- ☆24 · Feb 20, 2024 · Updated 2 years ago
- Samples for CUDA Developers which demonstrate features in CUDA Toolkit · ☆8,870 · Jan 6, 2026 · Updated last month
- ☆31 · Updated this week
- ☆33 · Mar 31, 2025 · Updated 11 months ago
- Files for blog posts on R for statistics at http://www.juanklopper.com · ☆15 · Mar 7, 2021 · Updated 4 years ago
- Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core. · ☆72 · Sep 8, 2024 · Updated last year
- Data-driven Geometric Multi-Grid solver for the discrete Poisson equation · ☆41 · Apr 17, 2022 · Updated 3 years ago
- ☆26 · Aug 9, 2025 · Updated 6 months ago
- An efficient C++20 GPU numerical computing library with Python-like syntax · ☆1,405 · Updated this week
- CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API · ☆35 · Sep 15, 2023 · Updated 2 years ago
- FEM implementation with Stable Neo-Hookean energy in libigl. · ☆35 · Dec 9, 2020 · Updated 5 years ago
- Python project dedicated to creating an open-source CAD designer using implicit equations. · ☆32 · Feb 20, 2024 · Updated 2 years ago
- Graph-indexed Pandas DataFrames for analyzing hierarchical performance data · ☆34 · Jan 30, 2026 · Updated last month
- GPU programming related news and material links · ☆1,997 · Sep 17, 2025 · Updated 5 months ago
- Introduction to Python Tutorial at SciPy 2019 · ☆34 · Jul 8, 2019 · Updated 6 years ago
- FP8 flash attention implemented on the Ada architecture using the cutlass repository · ☆79 · Aug 12, 2024 · Updated last year
- Accurate, Hardware Accelerated, Special Functions in Mojo 🔥 · ☆37 · Dec 5, 2024 · Updated last year
- ☆31 · Apr 22, 2024 · Updated last year
- A simple high performance CUDA GEMM implementation. · ☆426 · Jan 4, 2024 · Updated 2 years ago
- C and Python examples from my book on using PETSc and Firedrake to solve PDEs · ☆218 · Feb 8, 2026 · Updated 3 weeks ago