MagmaDNN / magmadnn
MagmaDNN: a simple deep learning framework in C++
☆45 · Updated 4 years ago
Related projects:
- Subset of BLAS routines optimized for NVIDIA GPUs ☆63 · Updated last year
- Next generation library for iterative sparse solvers for ROCm platform ☆74 · Updated this week
- Distributed Communication-Optimal LU-factorization Algorithm ☆12 · Updated 3 years ago
- ☆14 · Updated 3 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆21 · Updated last week
- Kokkos Remote Spaces implements distributed Kokkos Views and related APIs for distributed parallel programming. ☆42 · Updated 2 weeks ago
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. ☆34 · Updated 8 months ago
- Generate simple index ranges in C++ and CUDA C++ ☆38 · Updated last year
- Sparse matrix pre-processing library ☆81 · Updated 4 months ago
- A proxy app for the Monte Carlo Transport Code, Mercury. LLNL-CODE-684037 ☆39 · Updated 7 months ago
- MPI accelerator-integrated communication extensions ☆33 · Updated last year
- Resources for the introductory GPU programming course of the HPC-AI specialized master's program (mastère spécialisé) ☆22 · Updated 8 months ago
- Custom-Precision Floating-point numbers. ☆28 · Updated 3 months ago
- A GPU performance prediction toolkit for CUDA programs ☆16 · Updated 5 years ago
- Next generation LAPACK implementation for ROCm platform ☆91 · Updated this week
- A C++ library for computing large scale tensor contractions. ☆36 · Updated 6 years ago
- SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) sy… ☆84 · Updated 2 months ago
- Highly Efficient FFT for Exascale ☆35 · Updated 4 months ago
- RAJA Performance Suite ☆110 · Updated last week
- Next generation SPARSE implementation for ROCm platform ☆117 · Updated this week
- The SparseX sparse kernel optimization library ☆39 · Updated 5 years ago
- Round matrix elements to lower precision in MATLAB ☆35 · Updated 2 years ago
- ☆15 · Updated 8 months ago
- MiniFE Finite Element Mini-Application ☆28 · Updated 4 months ago
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆47 · Updated last month
- DLA-Future ☆63 · Updated this week
- The Task-Aware MPI (TAMPI) library extends the functionality of standard MPI libraries by providing new mechanisms for improving the inte… ☆23 · Updated 4 months ago
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆100 · Updated this week
- A unified framework across multiple programming platforms ☆28 · Updated 2 months ago
- HiCMA: Hierarchical Computations on Manycore Architectures ☆27 · Updated last year