hpdps-group / ICS23-GPULZ
GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs
☆14Updated last month
Alternatives and similar repositories for ICS23-GPULZ
Users who are interested in ICS23-GPULZ are comparing it to the repositories listed below
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.☆25Updated 7 months ago
- A GPU accelerated error-bounded lossy compression for scientific data.☆74Updated last week
- The ultimate memory bandwidth benchmark☆50Updated 4 months ago
- TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)☆39Updated last week
- DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression☆11Updated 4 years ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆52Updated 2 months ago
- ☆55Updated 6 years ago
- A unified framework across multiple programming platforms☆38Updated last week
- A task benchmark☆42Updated 10 months ago
- ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs☆83Updated last week
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU…☆35Updated 9 years ago
- Linux Cross-Memory Attach☆94Updated 8 months ago
- GPUDirect Async support for IB Verbs☆117Updated 2 years ago
- Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs☆89Updated last year
- ROCm SPARSE marshalling library☆67Updated this week
- Emulating DMA Engines on GPUs for Performance and Portability☆40Updated 10 years ago
- A hierarchical collective communications library with portable optimizations☆35Updated 6 months ago
- Bandwidth test for ROCm☆56Updated 2 weeks ago
- InstLatX64_Demo☆43Updated 2 weeks ago
- [CF ’20] Verified Instruction-Level Energy Consumption Measurement for NVIDIA GPUs☆15Updated 4 years ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs☆30Updated 8 months ago
- SYCL Reference Manual☆28Updated last year
- Drishti provides I/O insights to help you improve your application's I/O performance.☆20Updated 3 weeks ago
- Advanced Profiling and Analytics for AMD Hardware☆156Updated this week
- Simplified Interface to Complex Memory☆28Updated last year
- Directed Acyclic Graph Execution Engine (DAGEE) is a C++ library that enables programmers to express computation and data movement, as ta…☆46Updated 3 years ago
- MPI accelerator-integrated communication extensions☆33Updated 2 years ago
- Unit benchmarks of CUDA event APIs.☆17Updated last year
- A tracing infrastructure for heterogeneous computing applications.☆33Updated this week
- TLB Benchmarks☆34Updated 7 years ago