hpdps-group / ICS23-GPULZ
GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs
☆14Updated 10 months ago
Alternatives and similar repositories for ICS23-GPULZ:
Users who are interested in ICS23-GPULZ are comparing it to the libraries listed below.
- A GPU-accelerated error-bounded lossy compressor for scientific data.☆69Updated this week
- GPUDirect Async support for IB Verbs☆92Updated 2 years ago
- A portable implementation of SZ lossy compression for AMD GPUs and Hygon DCUs.☆7Updated 3 weeks ago
- ☆14Updated 8 months ago
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.☆22Updated 3 months ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.☆48Updated this week
- RCCL Performance Benchmark Tests☆55Updated this week
- Unit benchmarks of CUDA event APIs.☆17Updated 8 months ago
- A Multi-purpose, Application-Centric, Scalable I/O Proxy Application☆34Updated 4 years ago
- ☆48Updated 5 years ago
- Bandwidth test for ROCm☆52Updated this week
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth☆104Updated 7 years ago
- A thin wrapper around MIOpen and cuDNN☆40Updated last year
- A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆47Updated last year
- TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)☆38Updated this week
- Prototype of OpenSHMEM for NVIDIA GPUs, developed as part of DoE Design Forward☆20Updated 6 years ago
- An HPL-AI implementation for Fugaku☆19Updated 3 years ago
- PyTorch process group third-party plugin for UCC☆20Updated 9 months ago
- TLB Benchmarks☆32Updated 7 years ago
- Compute applications.☆24Updated 5 years ago
- ☆42Updated 4 years ago
- PyTorch UCC plugin☆18Updated 3 years ago
- A task benchmark☆40Updated 5 months ago
- Linux Cross-Memory Attach☆89Updated 4 months ago
- ☆14Updated 4 years ago
- ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs☆78Updated this week
- InstLatX64_Demo☆41Updated this week
- A GPU performance prediction toolkit for CUDA programs☆16Updated 5 years ago
- CUDA Flux is a profiler for GPU applications which reports the basic block execution frequencies of compute kernels☆31Updated 3 years ago
- DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression☆11Updated 4 years ago