celerity / ndzip
A High-Throughput Parallel Lossless Compressor for Scientific Data
☆58 · Updated last year
Related projects:
- mallocMC: Memory Allocator for Many Core Architectures ☆50 · Updated 3 weeks ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆39 · Updated 8 months ago
- A fast implementation of log() and exp() ☆49 · Updated last year
- Fast C header-only library for popcnt, pospopcnt, and set algebraic operations ☆44 · Updated 4 years ago
- SYCL Conformance Tests ☆60 · Updated last week
- ☆31 · Updated 3 years ago
- A Low-Level Abstraction of Memory Access ☆79 · Updated 6 months ago
- ☆68 · Updated 4 years ago
- Experimental ranges for CUDA ☆25 · Updated 5 years ago
- A fully featured single header library implementing a vector container with a small buffer optimization. ☆42 · Updated 7 months ago
- Thrust, CUB, TBB, AVX2, CUDA, OpenCL, OpenMP, SYCL - all it takes to sum a lot of numbers fast! ☆73 · Updated 4 months ago
- An implementation of HIP that works on CPUs, across OSes. ☆109 · Updated 6 months ago
- Generate simple index ranges in C++ and CUDA C++ ☆38 · Updated last year
- Giddy - A lightweight GPU decompression library ☆42 · Updated 5 years ago
- A comparative, extendible benchmarking suite for C and C++ hash-table libraries. ☆21 · Updated 3 months ago
- C++20 implementation of a 16-bit floating-point type mimicking most of the IEEE 754 behavior. Single file and header-only. ☆30 · Updated 7 months ago
- Compiler-agnostic metaprogramming library providing concepts, type operations and tuples for C++ and CUDA ☆78 · Updated last month
- Distributed ranges is a generalization of C++ ranges for distributed data structures. ☆46 · Updated this week
- ☆17 · Updated 7 years ago
- High-level C++ for Accelerator Clusters ☆139 · Updated this week
- Lossless compressor of multidimensional floating-point arrays ☆104 · Updated 4 years ago
- Massively Parallel Huffman Decoding on GPUs ☆40 · Updated 5 years ago
- Software implementation of ARM and x86 SIMD intrinsics ☆12 · Updated 5 years ago
- Experimental JSON builder based on C++ reflection ☆40 · Updated last week
- SYCL Open Source Specification ☆109 · Updated this week
- Task graph-based asynchronous programming system using C++ coroutine ☆82 · Updated 7 months ago
- GPGMM, a General-Purpose GPU Memory Management Library. ☆32 · Updated 7 months ago
- A reference implementation of std::simd, providing data parallel types in the C++ standard ☆12 · Updated 4 years ago
- Utilities for accessing AMD's Machine-Readable GPU ISA Specifications. ☆15 · Updated 3 weeks ago
- CUDA executors ☆14 · Updated 3 years ago