aditya12agd5 / cuda_bzip2
GPU Implementation of "Fast Burrows Wheeler Compression Using All-Cores" IPDPSW'15
☆14 Updated 4 years ago
Related projects
Alternatives and complementary repositories for cuda_bzip2
- GPU-Accelerated Lossless Data Compressors Survey ☆110 Updated 4 years ago
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆35 Updated 8 years ago
- Massively Parallel ANS Decoding on GPUs ☆28 Updated 5 years ago
- ROCm OpenCL Compiler Tool Driver ☆24 Updated 5 years ago
- TurboRC - Fastest Range Coder + Arithmetic Coding / Fastest Asymmetric Numeral Systems ☆71 Updated last year
- CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly. ☆107 Updated last year
- OpenCL/SPIR-V implementation of HIP ☆104 Updated 2 years ago
- A lightweight rANSCoder meant for rapid prototyping. ☆18 Updated last year
- ROCm - AMDGPU Compute Application Binary Interface ☆40 Updated 2 years ago
- A Benchmark Suite for Heterogeneous System Computation ☆52 Updated 3 weeks ago
- Next generation FFT implementation for ROCm ☆176 Updated this week
- rANS coder (derived from https://github.com/rygorous/ryg_rans) ☆81 Updated 2 years ago
- High performance block-sorting data compression library ☆289 Updated 9 months ago
- Library for fast image convolution in neural networks on Intel Architecture ☆29 Updated 7 years ago
- High Performance Linpack for GPUs (Using OpenCL, CUDA, CAL) ☆88 Updated 9 years ago
- C Framework for OpenCL ☆108 Updated 10 months ago
- Portable 128-bit SIMD intrinsics ☆57 Updated last year
- Scalable, Portable and Distributed Gradient Boosting ☆28 Updated last year
- immintrin_dbg.h is an include file, a wrapper around immintrin.h. It implements most of AVX, AVX2, AVX-512 vector intrinsics to enable so… ☆57 Updated last year
- A cross-platform (Windows and Linux) CPU memory latency benchmark. ☆46 Updated 2 years ago
- The HSA-Runtime ☆48 Updated 8 months ago
- GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs ☆14 Updated 8 months ago
- OpenCL wrapper for Mathworks Matlab ☆18 Updated 5 years ago
- ZFP Hardware Implementation ☆13 Updated last year
- FastAC - Amir Said's Arithmetic and Huffman coding library, example code, and documentation ☆28 Updated 2 years ago
- ☆18 Updated 3 years ago
- A translator from Intel SSE intrinsics to RISCV-V Extension implementation ☆17 Updated 2 months ago
- Experimental parallel compression algorithm ☆23 Updated 7 years ago
- ☆32 Updated 3 years ago
- A prototype CUDA-to-OpenCL source-to-source translator, built on the Clang compiler framework ☆190 Updated 4 years ago