DmitryLyakh / CUDA_Tutorial
☆22 Updated 4 years ago
Related projects:
- CompPhys - a Computational Physics repository ☆82 Updated 10 months ago
- A C++ library for computing large scale tensor contractions. ☆36 Updated 6 years ago
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆47 Updated last month
- Intermediate MPI lesson ☆25 Updated last year
- QMCPACK miniapp: a simplified real space QMC code for algorithm development, performance portability testing, and computer science experi… ☆27 Updated last month
- MagmaDNN: a simple deep learning framework in C++ ☆45 Updated 4 years ago
- Tensor Algebra Library Routines for Shared Memory Systems ☆38 Updated 9 months ago
- This repository mirrors the principal Gitlab repository of the Chebyshev Accelerated Subspace iteration Eigensolver. If you want to contr… ☆15 Updated last month
- GPU Eigensolver for symmetric/hermitian matrices. ☆64 Updated 2 years ago
- Contains sources related to the lectures and labs for the NVIDIA OpenACC course. ☆52 Updated 4 years ago
- Example codes from the book Parallel Programming With OpenACC ☆82 Updated 7 years ago
- MiniMD Molecular Dynamics Mini-App ☆47 Updated last month
- A Massively Parallel FFT Library for CPU/GPU ☆54 Updated 3 years ago
- ☆19 Updated 6 years ago
- This repository contains application codes and solutions for the Book on "OpenACC for Programmers - Concept & Strategies". ☆34 Updated 5 years ago
- The fftMPI library performs 2d/3d FFTs in parallel for grids distributed across MPI processes. ☆13 Updated 2 years ago
- Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceler… ☆27 Updated 2 months ago
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆187 Updated 2 months ago
- ALCF Computational Performance Workshop ☆33 Updated last year
- Implementation of MPI that supports large counts ☆44 Updated last year
- ☆21 Updated 3 years ago
- Offload Eigen operations to GPUs ☆17 Updated 2 years ago
- LAPACK++ is a C++ wrapper around CPU and GPU LAPACK and LAPACK-like linear algebra libraries, developed as part of the SLATE project. ☆46 Updated 2 months ago
- Training materials provided by OpenACC.org. ☆80 Updated last month
- Distributed Communication-Optimal LU-factorization Algorithm ☆12 Updated 3 years ago
- Lecture and hands-on material for Track 8 - Machine Learning of the Argonne Training Program on Extreme-Scale Computing ☆32 Updated last month
- ☆82 Updated 7 years ago
- Molecular dynamics proxy application based on Kokkos ☆30 Updated 2 months ago
- Highly Efficient FFT for Exascale ☆35 Updated 4 months ago
- DLA-Future ☆63 Updated this week