hpdps-group / coccl

COCCL: Compression and precision co-aware collective communication library

☆30

Alternatives and similar repositories for coccl

Users who are interested in coccl are comparing it to the libraries listed below.

szcompressor / FZ-GPU
FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Data on GPUs
☆14 · Updated 2 years ago
merthidayetoglu / HiCCL
A hierarchical collective communications library with portable optimizations
☆37 · Updated last year
shixun404 / Fault-Tolerant-SGEMM-on-NVIDIA-GPUs
Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs
☆13 · Updated 10 months ago
merthidayetoglu / CommBench
A Micro-benchmarking Tool for HPC Networks
☆34 · Updated 5 months ago
KernelTuner / kernel_launcher
Using C++ magic to capture CUDA kernels and tune them with Kernel Tuner
☆21 · Updated 4 months ago
mlcommons / hpc
Reference implementations of MLPerf™ HPC training benchmarks
☆49 · Updated 11 months ago
RIKEN-RCCS / hpl-ai
An HPL-AI implementation for Fugaku
☆23 · Updated 4 years ago
olcf / NVIDIA-tensor-core-examples
☆20 · Updated 6 years ago
FZJ-JSC / jubench
JUPITER Benchmark Suite
☆23 · Updated 6 months ago
sparticlesteve / cosmoflow-benchmark
Benchmark implementation of CosmoFlow in TensorFlow Keras
☆22 · Updated 2 years ago
ROCm / roc-stdpar
☆18 · Updated 2 years ago
llnl / hatchet
Graph-indexed Pandas DataFrames for analyzing hierarchical performance data
☆34 · Updated last week
HAWAIILAB / cuda-flux
CUDA Flux is a profiler for GPU applications which reports the basic block executions frequencies of compute kernels
☆32 · Updated 4 years ago
ROCm / rocSHMEM
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆144 · Updated this week
NVIDIA / mpi-acx
MPI accelerator-integrated communication extensions
☆39 · Updated 2 years ago
GMAP / NPB-GPU
NAS Parallel Benchmarks for evaluating GPU and APIs
☆29 · Updated 4 months ago
argonne-lcf / AIaccelerators-SC23-tutorial
AI Accelerators-SC23-tutorial Repository
☆11 · Updated 2 years ago
Jokeren / GPA
GPU Performance Advisor
☆65 · Updated 3 years ago
uuudown / Tartan
Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite
☆68 · Updated 7 years ago
coreyjadams / CosmicTagger
Cosmic Tagging Network for Neutrino Physics
☆13 · Updated last year
gpudirect / libmp
Simple message passing library
☆30 · Updated 7 years ago
eth-cscs / Tiled-MM
Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.
☆32 · Updated 10 months ago
llnl / mpibind
Pragmatic, Productive, and Portable Affinity for HPC
☆51 · Updated 3 weeks ago
temporal-hpc / reduction-tensor-cores
Fast GPU based tensor core reductions
☆13 · Updated 3 years ago
argonne-lcf / THAPI
A tracing infrastructure for heterogeneous computing applications.
☆40 · Updated last week
ParaStation / psmpi
☆19 · Updated 3 weeks ago
pnnl / COMET
☆41 · Updated 4 months ago
NMSU-PEARL / PPT-GPU
Performance Prediction Toolkit for GPUs
☆39 · Updated 3 years ago
ROCm / rocHPL
High Performance Linpack for Next-Generation AMD HPC Accelerators
☆65 · Updated 2 months ago
spcl / daceml
A Data-Centric Compiler for Machine Learning
☆85 · Updated last month