riakymch / exblas
ExBLAS: fast, accurate, and reproducible BLAS
☆13 Updated 3 years ago
Alternatives and similar repositories for exblas
Users who are interested in exblas are comparing it to the libraries listed below.
- PaStiX (Parallel Sparse matriX package) solver library ☆14 Updated 6 years ago
- Recursive LAPACK Collection ☆42 Updated 3 years ago
- This repository mirrors the principal Gitlab repository of the Chebyshev Accelerated Subspace iteration Eigensolver. If you want to contr… ☆18 Updated last week
- Linnea is an experimental tool for the automatic generation of optimized code for linear algebra problems. ☆70 Updated 3 years ago
- Collection of simple General Matrix Multiplication - GEMM implementations ☆13 Updated last year
- Trust Region Subproblem Solver Library ☆20 Updated 11 months ago
- Library for chordal matrix computations ☆25 Updated 6 years ago
- Flexible and performant GEMM kernels in Julia ☆82 Updated 3 weeks ago
- OptimPack is a library for large optimization problems. ☆37 Updated 3 months ago
- A hierarchical matrix C/C++ library ☆24 Updated this week
- Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction ☆69 Updated 4 months ago
- Julia wrappers for Trilinos ☆17 Updated 6 years ago
- Julia ports of the Rodinia benchmark suite for heterogeneous computing infrastructures ☆53 Updated last year
- Sympiler is a Code Generator for Transforming Sparse Matrix Codes ☆43 Updated 2 years ago
- cuASR: CUDA Algebra for Semirings ☆36 Updated 2 years ago
- Proof of Concept: a C-callable GPU-enabled parallel 2-D heat diffusion solver written in Julia using CUDA, MPI and graphics ☆24 Updated 4 years ago
- Development of SuiteSparse.jl, which ships as part of the Julia standard library. ☆26 Updated 2 years ago
- Fast orthogonal polynomial transforms ☆62 Updated 10 months ago
- Sparse symmetric indefinite solver implemented with a runtime system ☆13 Updated 5 years ago
- TensorOperations and cuTENSOR combined ☆13 Updated 5 years ago
- associative floating point addition ☆18 Updated last year
- sparse LU factorization and update ☆13 Updated last year
- Interface for PyTorch's C++ backend, focusing on ATen, AutoGrad, and JIT ☆23 Updated 9 months ago
- High-Performance Reproducible BLAS using posit arithmetic ☆12 Updated 3 years ago
- BLAS++ is a C++ wrapper around CPU and GPU BLAS (basic linear algebra subroutines), developed as part of the SLATE project. ☆80 Updated 2 weeks ago
- ☆30 Updated last week
- Automatic GPU, TPU, FPGA, Xeon Phi, Multithreaded, Distributed, etc. offloading for scientific machine learning (SciML) and differential … ☆34 Updated 3 years ago
- The fastest tropical matrix multiplication in the world! ☆30 Updated last year
- Parallel solvers for optimization problems ☆77 Updated 4 years ago
- A version of the STREAM benchmark which measures the sustainable memory bandwidth. ☆27 Updated 11 months ago