flame / tblis-strassen
Strassen's Algorithm for Tensor Contraction
☆11 · Updated 7 years ago
Related projects:
- QMCPACK miniapp: a simplified real space QMC code for algorithm development, performance portability testing, and computer science experi… ☆27 · Updated last month
- Tensor Contraction Code Generator ☆36 · Updated 7 years ago
- C++ library for tensor computations ☆32 · Updated last year
- ☆14 · Updated 3 years ago
- The SparseX sparse kernel optimization library ☆39 · Updated 5 years ago
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆47 · Updated last month
- Communication Avoiding Numerical Dense Matrix Computations ☆11 · Updated 3 years ago
- Tensor Algebra Library Routines for Shared Memory Systems ☆38 · Updated 9 months ago
- A C++ library for computing large scale tensor contractions. ☆36 · Updated 6 years ago
- A scalable eigensolver for dense, symmetric (hermitian) matrices (fork of https://gitlab.mpcdf.mpg.de/elpa/elpa.git) ☆27 · Updated 3 weeks ago
- Distributed-memory, double-precision, polar decomposition (QDWH/ZOLO-PD) of a dense matrix, svd (QDWH/ZOLOPD-SVD) of a dense matrix ☆12 · Updated 4 years ago
- Classical molecular dynamics proxy application. ☆28 · Updated 4 years ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆44 · Updated 9 years ago
- Distributed-parallel C/C++ Tensor Library ☆18 · Updated 5 years ago
- A BUDE virtual-screening benchmark, in many programming models ☆24 · Updated 3 weeks ago
- Basic Tensor Algebra Subroutines ☆45 · Updated last month
- A Task-based Library for Solving Dense Nonsymmetric Eigenvalue Problems ☆21 · Updated last year
- An implementation of ARMCI using MPI one-sided communication (RMA) ☆12 · Updated 3 weeks ago
- Tensor Contraction C++ Library ☆50 · Updated 5 years ago
- Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction ☆65 · Updated 2 weeks ago
- Recursive LAPACK Collection ☆42 · Updated 2 years ago
- Partitioned Global Address Space (PGAS) library for distributed arrays ☆97 · Updated this week
- C++ Header-Only Library for High-Performance Tensor-Vector Multiplication ☆19 · Updated 3 months ago
- Home of ALP/GraphBLAS and ALP/Pregel, featuring shared- and distributed-memory auto-parallelisation of linear algebraic and vertex-centri… ☆24 · Updated this week
- Julia ports of the Rodinia benchmark suite for heterogeneous computing infrastructures ☆47 · Updated last year
- Global Memory and Threading runtime system ☆23 · Updated 4 months ago
- A place to store information for the tensor discussions and possible specifications. ☆14 · Updated 3 months ago
- HiCMA: Hierarchical Computations on Manycore Architectures ☆27 · Updated last year
- sparse matrix pre-processing library ☆81 · Updated 4 months ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆21 · Updated last week