gptune / GPTune
☆69Updated last month
Alternatives and similar repositories for GPTune:
Users who are interested in GPTune are comparing it to the libraries listed below:
- HiCMA: Hierarchical Computations on Manycore Architectures☆30Updated 2 years ago
- Tensor Contraction Code Generator☆36Updated 7 years ago
- SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) sy…☆110Updated 2 months ago
- Round matrix elements to lower precision in MATLAB☆36Updated 2 years ago
- A searchable Python interface to the SuiteSparse Matrix Collection☆45Updated 2 years ago
- ☆15Updated 3 years ago
- XBraid Parallel-in-Time Solvers☆76Updated 6 months ago
- Library of GPU-resident linear solvers☆60Updated last week
- H2Opus: a performance-oriented library for hierarchical matrices☆13Updated 2 years ago
- A proxy app for the Monte Carlo Transport Code, Mercury. LLNL-CODE-684037☆40Updated last year
- Graph-indexed Pandas DataFrames for analyzing hierarchical performance data☆31Updated 4 months ago
- ytopt: machine-learning-based autotuning and hyperparameter optimization framework using Bayesian Optimization☆48Updated last week
- Distributed Communication-Optimal LU-factorization Algorithm☆12Updated 3 years ago
- Next generation library for iterative sparse solvers for ROCm platform☆78Updated this week
- RAJA Performance Suite☆118Updated this week
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development.☆36Updated 6 months ago
- Zoltan Dynamic Load Balancing and Graph Algorithm Toolkit -- Distribution site☆34Updated last year
- Fast gradient evaluation in C++ based on Expression Templates.☆94Updated this week
- Lecture and hands-on material for Track 8 (Machine Learning) of the Argonne Training Program on Extreme-Scale Computing☆37Updated 7 months ago
- ParMETIS - Parallel Graph Partitioning and Fill-reducing Matrix Ordering☆131Updated last year
- Structured Matrix Package (LBNL)☆173Updated 3 months ago
- MagmaDNN: a simple deep learning framework in C++☆49Updated 4 years ago
- GPU accelerated multigrid library for Python☆56Updated 5 months ago
- An implementation of the 1. Parallel, 2. Streaming, 3. Randomized SVD using MPI4Py☆58Updated 3 years ago
- parGeMSLR is an MPI-based package for solving and preconditioning sparse linear systems, implemented in C++.☆25Updated 2 years ago
- PaRSEC is a generic framework for architecture aware scheduling and management of micro-tasks on distributed, GPU accelerated, many-core …☆55Updated last week
- H2 Matrix Package☆29Updated last year
- Benchmark implementation of CosmoFlow in TensorFlow Keras☆21Updated last year
- Resources for the introductory GPU programming course of the HPC-AI Mastère Spécialisé☆22Updated last year
- Highly Efficient FFT for Exascale☆37Updated 10 months ago