kisupov / mpres-blas
Multiple-precision GPU-accelerated linear algebra routines (dense and sparse) based on the residue number system
☆17 Updated 2 years ago
Alternatives and similar repositories for mpres-blas:
Users who are interested in mpres-blas are comparing it to the libraries listed below.
- cuASR: CUDA Algebra for Semirings ☆35 Updated 2 years ago
- Subset of BLAS routines optimized for NVIDIA GPUs ☆68 Updated 2 years ago
- Next generation library for iterative sparse solvers for ROCm platform ☆81 Updated last week
- FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme ☆60 Updated last month
- ☆32 Updated 4 years ago
- Reference implementation of the draft C++ GraphBLAS specification. ☆32 Updated 2 months ago
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆72 Updated last month
- CUDA Template Functions ☆19 Updated 4 months ago
- Recursive LAPACK Collection ☆42 Updated 3 years ago
- Distributed Communication-Optimal LU-factorization Algorithm ☆12 Updated 3 years ago
- AMD optimized Sparse Linear Algebra library ☆29 Updated 3 weeks ago
- A GPU algorithm for sparse matrix-matrix multiplication ☆70 Updated 4 years ago
- Basic Polynomial Algebra Subprograms ☆15 Updated 3 years ago
- Omni Compiler for C and Fortran programs with XcalableMP and OpenACC directives ☆61 Updated last year
- A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆50 Updated last month
- Next generation LAPACK implementation for ROCm platform ☆100 Updated this week
- This repository mirrors the principal GitLab repository of the Chebyshev Accelerated Subspace iteration Eigensolver. If you want to contr… ☆17 Updated 3 weeks ago
- ROCm SPARSE marshalling library ☆67 Updated this week
- sparse matrix pre-processing library ☆81 Updated last year
- Kernel Tuning Toolkit ☆59 Updated last month
- ExBLAS: fast, accurate, and reproducible BLAS ☆13 Updated 3 years ago
- Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceler… ☆29 Updated 10 months ago
- Linnea is an experimental tool for the automatic generation of optimized code for linear algebra problems. ☆69 Updated 3 years ago
- High-performance Geometric Multigrid ☆35 Updated 6 years ago
- A simple, but fast, triangular solver ☆17 Updated 4 years ago
- nvptx-tools: a collection of tools for use with nvptx-none GCC toolchains. ☆50 Updated 8 months ago
- ☆39 Updated 2 weeks ago
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆108 Updated this week
- llvm-project cloned from https://github.com/llvm/llvm-project and modified for VE ☆19 Updated last week
- MagmaDNN: a simple deep learning framework in C++ ☆49 Updated 4 years ago