mpi4py / shmem4py
Python bindings for OpenSHMEM
☆16 Updated this week
Alternatives and similar repositories for shmem4py:
Users who are interested in shmem4py are comparing it to the libraries listed below:
- OpenMP vs Offload ☆21 Updated last year
- Comb is a communication performance benchmarking tool. ☆24 Updated 2 years ago
- An MPI ABI compatibility layer ☆32 Updated last month
- Graph-indexed Pandas DataFrames for analyzing hierarchical performance data ☆32 Updated 5 months ago
- Tensor Contraction Code Generator ☆37 Updated 7 years ago
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆53 Updated last month
- A proxy app for the Monte Carlo Transport Code, Mercury. LLNL-CODE-684037 ☆41 Updated last year
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆43 Updated this week
- My blog. ☆27 Updated last week
- A BUDE virtual-screening benchmark, in many programming models ☆28 Updated 6 months ago
- Molecular dynamics proxy application based on Kokkos ☆32 Updated 9 months ago
- A task benchmark ☆41 Updated 8 months ago
- A C++-based implementation of the TeaLeaf heat conduction mini-app. This implementation of TeaLeaf replicates the functionality of the ref… ☆23 Updated 8 months ago
- Training examples for SYCL ☆40 Updated this week
- SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) sy… ☆112 Updated 3 months ago
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. ☆36 Updated last week
- Analyze graph/hierarchical performance data using pandas dataframes ☆113 Updated 2 months ago
- Department of Energy Standard Utility Library ☆31 Updated last month
- MPI wrapper generator, for writing PMPI tool libraries ☆34 Updated last month
- Dynamic execution environments for coupled, thread-heterogeneous MPI+X applications ☆21 Updated last month
- JUPITER Benchmark Suite ☆16 Updated 8 months ago
- A project and machine deployment model using Spack ☆26 Updated 3 weeks ago
- Implementation of MPI that supports large counts ☆48 Updated 4 months ago
- Sources for the Oak Ridge Leadership Computing Facility User Documentation ☆65 Updated last week
- Logger for MPI communication ☆26 Updated last year
- ☆17 Updated last year
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆32 Updated 2 weeks ago
- NVIDIA Performance Libraries: Sample code ☆20 Updated 2 months ago
- Distributed View Extension for Kokkos ☆45 Updated 4 months ago
- YAKL is A Kokkos Layer: A simple C++ framework for performance portability and Fortran code porting ☆65 Updated 3 weeks ago