Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceleration.
☆31 · Jun 26, 2024 · Updated last year
Alternatives and similar repositories for spla
Users who are interested in spla compare it to the libraries listed below.
- The CSCS ReFrame test suite ☆16 · Updated this week
- Porting meshing tools and solvers that deal with unstructured meshes on GPUs ☆15 · Mar 12, 2026 · Updated last week
- ☆22 · Feb 26, 2026 · Updated 3 weeks ago
- DLA-Future ☆84 · Mar 10, 2026 · Updated last week
- Base container for developing C++ and Fortran HPC applications ☆18 · Jun 14, 2022 · Updated 3 years ago
- Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction. ☆13 · Nov 3, 2023 · Updated 2 years ago
- Domain specific library for electronic structure calculations ☆164 · Updated this week
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line ☆24 · Mar 15, 2026 · Updated last week
- MPI+Kokkos implementation of spectral difference method (SDM) high order schemes ☆28 · Feb 2, 2025 · Updated last year
- Escoin: Efficient Sparse Convolutional Neural Network Inference on GPUs ☆16 · Feb 28, 2019 · Updated 7 years ago
- Tensor Algebra for many-body methods ☆19 · Feb 3, 2026 · Updated last month
- Netlib Scalapack with robust CMake ☆14 · Feb 25, 2026 · Updated 3 weeks ago
- CSCS public documentation ☆30 · Updated this week
- STREAMer: Benchmarking remote volatile and non-volatile memory bandwidth ☆17 · Aug 21, 2023 · Updated 2 years ago
- Frame-to-Frame Registration using Gaussian Mixture Models. ☆23 · Mar 2, 2024 · Updated 2 years ago
- An expression template based linear algebra library running completely on the GPU using CUDA ☆25 · Jun 24, 2021 · Updated 4 years ago
- A SCVT mesh generation tool ☆13 · Nov 28, 2020 · Updated 5 years ago
- Recipes for software stacks on Alps vClusters. ☆16 · Updated this week
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆55 · Jul 25, 2025 · Updated 7 months ago
- Userspace eBPF Runtime Benchmarking Test Suite and Results ☆16 · Updated this week
- C++17 Wrapper for ScaLAPACK ☆11 · Oct 5, 2023 · Updated 2 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆31 · Apr 2, 2025 · Updated 11 months ago
- An MLIR-based AI compiler designed for Python frontend to RISC-V DSA ☆13 · Oct 10, 2024 · Updated last year
- C++ library for graph ordering ☆15 · Mar 20, 2020 · Updated 6 years ago
- DBCSR: Distributed Block Compressed Sparse Row matrix library ☆153 · Updated this week
- A Monte Carlo Neutron Transport Mini-App ☆15 · Apr 15, 2019 · Updated 6 years ago
- ☆11 · Aug 8, 2021 · Updated 4 years ago
- Development/testing repo for SWIG+Fortran ☆11 · Mar 25, 2018 · Updated 7 years ago
- JIT-compiled GPU kernels for quantum chemistry ☆31 · Jan 30, 2026 · Updated last month
- LAPACK++ is a C++ wrapper around CPU and GPU LAPACK and LAPACK-like linear algebra libraries, developed as part of the SLATE project. ☆76 · Oct 22, 2025 · Updated 5 months ago
- OpenMP offload playground ☆10 · Nov 16, 2024 · Updated last year
- Generic exascale-ready library for halo-exchange operations on variety of grids/meshes ☆10 · Updated this week
- PyTorch implementation of joint coordinate and sparse parametric encodings for offline RGB-D surface reconstruction ☆19 · May 13, 2023 · Updated 2 years ago
- The Kokkos Fortran Interop repository contains tools and interfaces which help interactions between Fortran portions of an applications a… ☆38 · Mar 12, 2026 · Updated last week
- Simple small molecular docking and conformation filtering tool. ☆13 · Updated this week
- Massively Asynchronous Coding Environment ☆18 · Oct 21, 2012 · Updated 13 years ago
- GPU-Accelerated multigrid solver for Poisson's equation in 2D ☆29 · Apr 25, 2021 · Updated 4 years ago
- ☆17 · Dec 10, 2018 · Updated 7 years ago
- ☆14 · Sep 22, 2019 · Updated 6 years ago