High-performance, GPU-aware communication library
☆88 · Dec 16, 2025 · Updated 3 months ago
Alternatives and similar repositories for Aluminum
Users that are interested in Aluminum are comparing it to the libraries listed below.
- High performance NCCL plugin for Bagua. ☆15 · Sep 15, 2021 · Updated 4 years ago
- Comb is a communication performance benchmarking tool. ☆26 · Feb 27, 2023 · Updated 3 years ago
- Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction ☆71 · Mar 17, 2025 · Updated last year
- Livermore Big Artificial Neural Network Toolkit ☆229 · Mar 16, 2026 · Updated last week
- Distributed Communication-Optimal Shuffle and Transpose Algorithm ☆14 · Feb 20, 2026 · Updated last month
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆32 · Apr 2, 2025 · Updated 11 months ago
- C++17 Wrapper for ScaLAPACK ☆11 · Oct 5, 2023 · Updated 2 years ago
- A GPU performance prediction toolkit for CUDA programs ☆19 · Mar 25, 2019 · Updated 7 years ago
- SN Application Proxy ☆52 · Jun 22, 2022 · Updated 3 years ago
- CHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as lo… ☆31 · Mar 13, 2026 · Updated last week
- Compiler-agnostic metaprogramming library providing concepts, type operations and tuples for C++ and CUDA ☆97 · Mar 5, 2026 · Updated 2 weeks ago
- Pragmatic, Productive, and Portable Affinity for HPC ☆51 · Mar 8, 2026 · Updated 2 weeks ago
- OCCA Python API: JIT Compilation for Multiple Architectures ☆11 · Dec 20, 2019 · Updated 6 years ago
- High-order Lagrangian Hydrodynamics Miniapp ☆201 · Mar 12, 2026 · Updated last week
- Near-optimal Prefetching System ☆33 · Nov 17, 2021 · Updated 4 years ago
- Bagua tutorials. ☆13 · Sep 4, 2022 · Updated 3 years ago
- Parallel fast Fourier transforms ☆59 · Jan 8, 2019 · Updated 7 years ago
- DLA-Future ☆84 · Mar 18, 2026 · Updated last week
- Mini-applications that exclusively use the Kokkos programming model ☆12 · Mar 21, 2023 · Updated 3 years ago
- This aims to be a wrapper to C-MPI3 for C++, using the principles of simplicity, STL, RAII and Boost and enforcing type-safety. This i… ☆23 · Oct 11, 2024 · Updated last year
- An application-focused API for memory management on NUMA & GPU architectures ☆400 · Mar 13, 2026 · Updated last week
- Parallel GDB developed for debugging HPC code at Lawrence Livermore National Laboratory. ☆32 · Nov 3, 2015 · Updated 10 years ago
- SST DUMPI Trace Library ☆14 · Nov 6, 2023 · Updated 2 years ago
- Portable HPC Containers (C++) ☆49 · Mar 16, 2026 · Updated last week
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. ☆37 · Mar 5, 2026 · Updated 2 weeks ago
- Large-scale Visualization Data Storage in Python ☆20 · Mar 13, 2026 · Updated last week
- Damselfly Network Simulator ☆10 · Nov 19, 2020 · Updated 5 years ago
- A pseudo random number generator library written against the SYCL API. ☆11 · Jun 11, 2019 · Updated 6 years ago
- Fork of cyclops-community/ctf repository updated haphazardly, previously this was main repo location ☆10 · Aug 7, 2018 · Updated 7 years ago
- Library for generating C and Fortran bindings for C++ functions from C++ ☆17 · Feb 2, 2021 · Updated 5 years ago
- A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology ☆1,355 · Mar 12, 2026 · Updated last week
- Unified Collective Communication Library ☆297 · Updated this week
- A dynamic analysis tool to detect floating-point errors in HPC applications. ☆40 · Updated this week
- ☆11 · Aug 8, 2021 · Updated 4 years ago
- Kubernetes operator for Bagua distributed training job. ☆13 · Feb 7, 2023 · Updated 3 years ago
- ☆43 · Jun 3, 2024 · Updated last year
- Multidimensional arrays for C++. (Not an official Boost library) \\ This is a mirror of gitlab.com/correaa/boost-multi ☆19 · Mar 17, 2026 · Updated last week
- Astrophysics MHD simulation code optimized for large clusters of GPUs ☆58 · Dec 20, 2024 · Updated last year
- STREAM, for lots of devices written in many programming models ☆358 · Feb 20, 2026 · Updated last month