aditya12agd5 / cuda_bzip2
GPU Implementation of "Fast Burrows Wheeler Compression Using All-Cores" IPDPSW'15
☆14 Updated 4 years ago
Related projects:
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆35 Updated 8 years ago
- GPU-Accelerated Lossless Data Compressors Survey ☆110 Updated 4 years ago
- ROCm OpenCL Compiler Tool Driver ☆24 Updated 4 years ago
- Massively Parallel ANS Decoding on GPUs ☆26 Updated 5 years ago
- OpenCL/SPIR-V implementation of HIP ☆104 Updated last year
- Massively Parallel Huffman Decoding on GPUs ☆40 Updated 5 years ago
- C library for the emulation of reduced-precision floating point types ☆45 Updated last year
- ROCm - AMDGPU Compute Application Binary Interface ☆40 Updated 2 years ago
- This repository contains documentation for setting up an HSA platform, building OpenMP applications using GCC, and running them on an HSA device ☆16 Updated 8 years ago
- C Framework for OpenCL ☆108 Updated 8 months ago
- CMake module to optimize cflags for architecture extensions such as SSE, AVX ☆27 Updated 2 months ago
- Intel® GPU Compute Samples ☆95 Updated 4 months ago
- High performance block-sorting data compression library ☆281 Updated 7 months ago
- CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly. ☆107 Updated last year
- Arpra is a C library for analyzing the propagation of numerical error in arbitrary precision IEEE-754 floating-point computations. ☆23 Updated last year
- OpenSHMEM Application Programming Interface ☆51 Updated 3 weeks ago
- Next generation FFT implementation for ROCm ☆173 Updated this week
- A Benchmark Suite for Heterogeneous System Computation ☆52 Updated last week
- CUDA Waste is a wrapper for emulation of CUDA programs on Windows ☆12 Updated 8 years ago
- rANS coder (derived from https://github.com/rygorous/ryg_rans) ☆80 Updated 2 years ago
- ☆74 Updated last year
- RISC-V GPGPU ☆34 Updated 4 years ago
- immintrin_dbg.h is an include file, a wrapper around immintrin.h. It implements most of AVX, AVX2, AVX-512 vector intrinsics to enable so… ☆57 Updated last year
- Experimental parallel compression algorithm ☆23 Updated 6 years ago
- Giddy - A lightweight GPU decompression library ☆42 Updated 5 years ago
- This repository contains my experiments with compression-related algorithms ☆35 Updated 8 years ago
- TurboRC - Fastest Range Coder + Arithmetic Coding / Fastest Asymmetric Numeral Systems ☆70 Updated last year
- Information about AVX-512 support on recent Intel processors ☆41 Updated 2 years ago
- Any code related to AMDGPUs ☆8 Updated 6 years ago
- A tool for debugging and assessing floating point precision and reproducibility. ☆64 Updated 4 months ago