lin-toto / recoil
Recoil: Parallel rANS Decoding with Decoder-Adaptive Scalability
☆16 Updated 2 years ago
Alternatives and similar repositories for recoil
Users who are interested in recoil are comparing it to the libraries listed below.
- GPU B-Tree with support for versioning (snapshots). ☆49 Updated 8 months ago
- InstLatX64_Demo ☆43 Updated last month
- Directed Acyclic Graph Execution Engine (DAGEE) is a C++ library that enables programmers to express computation and data movement, as ta… ☆46 Updated 3 years ago
- ☆58 Updated last month
- SYCL Reference Manual ☆28 Updated last year
- ☆10 Updated 5 months ago
- Reference implementation of Deep Neural Network primitives using LIBXSMM's Tensor Processing Primitives (TPP) ☆12 Updated 3 months ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆55 Updated 3 months ago
- Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal, and Rust - all it takes to sum a lot of numbers fast! ☆99 Updated last month
- ☆48 Updated this week
- Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) APIs ☆117 Updated 2 weeks ago
- Fast CRC32 implementations ☆80 Updated 3 weeks ago
- ☆26 Updated 3 months ago
- A GPU accelerated error-bounded lossy compression for scientific data. ☆84 Updated last month
- TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs) ☆42 Updated this week
- Fast C header-only library for popcnt, pospopcnt, and set algebraic operations ☆45 Updated 5 years ago
- A software library of lossless data compression methods tuned and optimized for AMD “Zen”-based CPUs ☆29 Updated last week
- Code for paper "Engineering a High-Performance GPU B-Tree" accepted to PPoPP 2019 ☆57 Updated 3 years ago
- A fast implementation of log() and exp() ☆53 Updated 2 years ago
- Bandwidth test for ROCm ☆59 Updated 2 weeks ago
- A user level library for applications to transparently use Intel DSA. ☆38 Updated 2 weeks ago
- BGHT: High-performance static GPU hash tables. ☆68 Updated last week
- A High-Throughput Parallel Lossless Compressor for Scientific Data ☆70 Updated 2 years ago
- Encapsulate the frequently used AVX instructions as independent modules to reduce repeated development workload. ☆123 Updated last year
- ☆16 Updated 3 months ago
- ALP: Adaptive Lossless Floating-Point Compression ☆107 Updated 2 months ago
- A library for constructing allocators and memory pools. It also contains broadly useful abstractions and utilities for memory management.… ☆66 Updated this week
- Distributed ranges is a generalization of C++ ranges for distributed data structures. ☆51 Updated last week
- ☆37 Updated last year
- Linear algebra subroutines for large SSD-resident dense and sparse matrices ☆27 Updated 4 years ago