Prototype of OpenSHMEM for NVIDIA GPUs, developed as part of DoE Design Forward
☆25Apr 26, 2018Updated 7 years ago
Alternatives and similar repositories for df-nvshmem-prototype
Users who are interested in df-nvshmem-prototype are comparing it to the libraries listed below.
- Aries Network Performance Counters Monitoring Library☆11Nov 19, 2020Updated 5 years ago
- GPUDirect Async implementation of HPGMG-FV CUDA☆11May 11, 2018Updated 7 years ago
- CPE change log and release notes☆26Sep 3, 2024Updated last year
- High Performance C++ Turbulent flow Lattice Boltzmann code☆17Sep 19, 2019Updated 6 years ago
- ☆19Jan 17, 2024Updated 2 years ago
- working on a cluster manager for TF☆11Mar 13, 2017Updated 9 years ago
- MPI accelerator-integrated communication extensions☆40Apr 4, 2023Updated 2 years ago
- Guides and examples to help achieve optimal performance on an NVIDIA Grace CPU☆16Aug 9, 2024Updated last year
- Please visit http://nrel.github.io/OpenWARP/ for more information.☆23Nov 18, 2016Updated 9 years ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …☆30Dec 21, 2024Updated last year
- ☆17Sep 15, 2021Updated 4 years ago
- Fortran 2003 wrappers for POSIX threads☆12Oct 13, 2017Updated 8 years ago
- Effective transpose on Hopper GPU☆28Sep 6, 2025Updated 6 months ago
- This repository contains an implementation for Portals4. Portals4 is a Network Programming Interface which allows high-performance networ…☆14Sep 3, 2024Updated last year
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.☆122Nov 15, 2023Updated 2 years ago
- An Open-Source Community Supported Fortran layer for AMD HIP☆10May 20, 2020Updated 5 years ago
- GPU implementation of classical molecular dynamics proxy application.☆31Jan 30, 2017Updated 9 years ago
- Comb is a communication performance benchmarking tool.☆26Feb 27, 2023Updated 3 years ago
- Extension to Theano for multi-GPU data parallelism☆20Sep 26, 2017Updated 8 years ago
- QUDA is a library for performing calculations in lattice QCD on GPUs.☆345Updated this week
- https://eth-cscs.github.io/uenv/☆12Nov 18, 2024Updated last year
- Space-Time Variable Code Generator and Solver☆16Nov 2, 2023Updated 2 years ago
- OpenGL based 3D engine with a viewer API for pointcloud, surface meshes☆14May 10, 2023Updated 2 years ago
- Pragmatic, Productive, and Portable Affinity for HPC☆51Mar 8, 2026Updated last week
- A discrete dipole approximation (DDA) implementation for the GPU☆17Feb 15, 2016Updated 10 years ago
- A MATLAB toolkit for numerical differentiation using the WENO5 scheme, mainly for level set simulation.☆10Jul 11, 2016Updated 9 years ago
- Unstructured computations on emerging architectures.☆14Jun 1, 2022Updated 3 years ago
- A Benchmark for Surface Reconstruction☆12Oct 5, 2015Updated 10 years ago
- MiniAMR Adaptive Mesh Refinement (AMR) Mini-App☆39Nov 12, 2024Updated last year
- C library containing high resolution timer implementation for several platforms.☆10Oct 20, 2020Updated 5 years ago
- A simple pseudo-spectral solver for the Direct Numerical Simulation (DNS) of the 3D Taylor-Green Vortex in the Julia programming language☆10Jun 6, 2022Updated 3 years ago
- ☆11Feb 17, 2026Updated last month
- Backprop with Low-Precision Activations☆11Oct 28, 2019Updated 6 years ago
- General interest repository for CSCS users☆52Feb 20, 2025Updated last year
- VascuSynth: Vascular Tree Synthesis Software☆11Nov 23, 2022Updated 3 years ago
- Variable-density incompressible Navier-Stokes code in F90☆13Nov 16, 2021Updated 4 years ago
- RDMA and SHARP plugins for nccl library☆224Updated this week
- MoSAIC: Modular system for Acceleration Integration☆10Aug 22, 2025Updated 6 months ago
- Implementation of COO, CSR, CSC, SSS and TJDS sparse matrix formats.☆11Jul 15, 2015Updated 10 years ago