Multiple 1-stencil implementations using NVIDIA CUDA.
☆13 · Dec 2, 2017 · Updated 8 years ago
Alternatives and similar repositories for one_stencil
Users who are interested in one_stencil are comparing it to the libraries listed below.
- Mini-applications that exclusively use the Kokkos programming model ☆12 · Mar 21, 2023 · Updated 2 years ago
- Simian Process Oriented Conservative JIT PDES from LANL ☆13 · Dec 12, 2025 · Updated 2 months ago
- Absinthe is an optimization framework to fuse and tile stencil codes in one shot ☆14 · Jul 17, 2019 · Updated 6 years ago
- Code repo for lotsofcores.com book 1, here since dropbox doesn't work for everyone ☆27 · Apr 8, 2016 · Updated 9 years ago
- Simple LBM kernels for benchmarking and performance evaluation ☆14 · Jun 6, 2018 · Updated 7 years ago
- Parallel direct solver for Poisson's equation for pressure ☆11 · Aug 22, 2020 · Updated 5 years ago
- Orio is an open-source extensible framework for the definition of domain-specific languages and generation of optimized code for multiple… ☆37 · Dec 13, 2025 · Updated 2 months ago
- Automatically exported from code.google.com/p/patus ☆16 · Sep 3, 2015 · Updated 10 years ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA ☆35 · Jul 28, 2020 · Updated 5 years ago
- A library for C++/Fortran computer simulations (e.g. stencil codes, mesh-free, unstructured grids, n-body & particle methods). Scales fro… ☆40 · Apr 13, 2021 · Updated 4 years ago
- A GPU-friendly materials library with a focus on crystal plasticity methods ☆23 · Dec 1, 2025 · Updated 3 months ago
- Heterogeneous Accelerated Computed Cluster (HACC) Resources Page ☆22 · Oct 7, 2025 · Updated 4 months ago
- ☆14 · Jul 28, 2016 · Updated 9 years ago
- ICML2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial) ☆17 · Apr 9, 2019 · Updated 6 years ago
- Performance Prediction Toolkit ☆56 · Sep 13, 2025 · Updated 5 months ago
- Accelerate MCMC algorithm on GPU for Big Data Applications ☆23 · Jun 22, 2018 · Updated 7 years ago
- ☆50 · Jun 27, 2019 · Updated 6 years ago
- This tool serves as a test harness for different optimization techniques to improve stencil computations performance in shared and distri… ☆21 · Nov 9, 2022 · Updated 3 years ago
- P3DFFT++ (a.k.a. P3DFFT v. 3) is a new generation of P3DFFT library that aims to provide a comprehensive framework for simulating multis… ☆22 · Aug 7, 2023 · Updated 2 years ago
- A C++ allocator based on cudaMallocManaged ☆23 · Nov 19, 2018 · Updated 7 years ago
- A GPU cache model for research purposes ☆28 · Nov 4, 2013 · Updated 12 years ago
- GPU Code optimizer for stencil computations. Refer to our IPDPS'19 paper for more details ☆25 · Sep 27, 2019 · Updated 6 years ago
- A hydrodynamics mini-app to solve the compressible Euler equations in 2D, using an explicit, second-order method. ☆62 · Jul 3, 2020 · Updated 5 years ago
- The fast Finite Volume simulator with UQ support. ☆28 · Feb 16, 2025 · Updated last year
- ☆21 · Jul 28, 2016 · Updated 9 years ago
- A wavelet-based multifractal image analysis tool implementing the WTMM (Wavelet Transform Modulus Maxima) method. ☆11 · Feb 1, 2020 · Updated 6 years ago
- Experimental ranges for CUDA ☆25 · Feb 1, 2019 · Updated 7 years ago
- Deprecated to NaluCFD/Nalu or ExaWind/Nalu-Wind. See LICENSE for more information. ☆28 · Mar 9, 2017 · Updated 8 years ago
- Comb is a communication performance benchmarking tool. ☆26 · Feb 27, 2023 · Updated 3 years ago
- ☆25 · Feb 20, 2024 · Updated 2 years ago
- Visualization tool for analyzing call trees and graphs ☆35 · Mar 15, 2023 · Updated 2 years ago
- An open-source custom cache generator. ☆34 · Mar 14, 2024 · Updated last year
- Parallel GDB developed for debugging HPC code at Lawrence Livermore National Laboratory. ☆32 · Nov 3, 2015 · Updated 10 years ago
- Neural Network Based Lattice Boltzmann solver ☆26 · Oct 25, 2018 · Updated 7 years ago
- QMCPACK miniapp: a simplified real space QMC code for algorithm development, performance portability testing, and computer science experi… ☆27 · Jul 24, 2024 · Updated last year
- CHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as lo… ☆31 · Feb 13, 2026 · Updated 2 weeks ago
- ☆67 · Oct 10, 2024 · Updated last year
- COMS30053 - High Performance Computing - Lattice Boltzmann ☆35 · Jan 26, 2026 · Updated last month
- Lagrangian finite-element code for solid mechanics on next-generation computing platforms. ☆30 · Jul 10, 2025 · Updated 7 months ago