KarypisLab / GKlib
A library of various helper routines and frameworks used by much of the lab's software
☆39 Updated 4 months ago
Related projects:
- ParMETIS - Parallel Graph Partitioning and Fill-reducing Matrix Ordering ☆106 Updated 9 months ago
- High-performance Geometric Multigrid ☆31 Updated 5 years ago
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆97 Updated last year
- Local and distributed octrees based on Morton codes with halo discovery and exchange with a 3D collision detection algorithm ☆31 Updated 2 weeks ago
- Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceler… ☆27 Updated 2 months ago
- Portable HPC Containers (C++) ☆47 Updated 2 weeks ago
- Kokkos C++ Performance Portability Programming Ecosystem: Profiling and Debugging Tools ☆109 Updated 2 weeks ago
- Fast and full-featured Matrix Market I/O library for C++, Python, and R ☆73 Updated last month
- Kokkos Remote Spaces implements distributed Kokkos Views and related APIs for distributed parallel programming. ☆42 Updated 2 weeks ago
- LAPACK++ is a C++ wrapper around CPU and GPU LAPACK and LAPACK-like linear algebra libraries, developed as part of the SLATE project. ☆46 Updated 2 months ago
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆64 Updated last month
- MagmaDNN: a simple deep learning framework in C++ ☆45 Updated 4 years ago
- MiniAMR Adaptive Mesh Refinement (AMR) Mini-App ☆31 Updated 2 months ago
- DLA-Future ☆63 Updated this week
- Compiler-agnostic metaprogramming library providing concepts, type operations and tuples for C++ and CUDA ☆78 Updated last month
- A task benchmark ☆39 Updated last month
- This tool serves as a test harness for different optimization techniques to improve stencil computations performance in shared and distri… ☆20 Updated last year
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆90 Updated 2 years ago
- CUDA tool set for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core. ☆59 Updated 2 years ago
- Library for length-agnostic SIMD intrinsic support and the corresponding math operations ☆20 Updated 2 years ago
- data-parallel out-of-core library ☆48 Updated 2 weeks ago
- SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) sy… ☆84 Updated 2 months ago
- Generate simple index ranges in C++ and CUDA C++ ☆38 Updated last year
- BLAS++ is a C++ wrapper around CPU and GPU BLAS (basic linear algebra subroutines), developed as part of the SLATE project. ☆62 Updated 2 months ago
- mallocMC: Memory Allocator for Many Core Architectures ☆50 Updated 3 weeks ago
- Performance-portable geometric search library ☆177 Updated this week
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆15 Updated this week
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆47 Updated last month
- ☆54 Updated last year
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆39 Updated 8 months ago