apc-llc / whippletree
Whippletree, a novel approach to scheduling dynamic, irregular workloads on the GPU
☆21 Updated 9 years ago
Alternatives and similar repositories for whippletree:
Users who are interested in whippletree are comparing it to the libraries listed below.
- ☆64 Updated last year
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆50 Updated this week
- Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels. ☆70 Updated 9 years ago
- GPU Optimization and Memory Abstraction Framework ☆32 Updated 5 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆54 Updated 7 years ago
- BGHT: High-performance static GPU hash tables. ☆62 Updated 6 months ago
- CUDA matrix computation library specialized for small matrix operations (3x3, 4x4, 1x3, 1x4, etc.). Including buffer ☆20 Updated last year
- Parallel k-D Tree Construction ☆57 Updated 13 years ago
- CUDA Extension Wrangler ☆24 Updated 5 years ago
- EGGS, a method to speed up sparse matrix operations when the same sparsity is used multiple times. This repo contains examples that s… ☆25 Updated 4 years ago
- GPU B-Tree with support for versioning (snapshots). ☆47 Updated 4 months ago
- Evaluating different memory managers for dynamic GPU memory ☆25 Updated 4 years ago
- mallocMC: Memory Allocator for Many Core Architectures ☆55 Updated last month
- ☆68 Updated 2 years ago
- ☆50 Updated 5 years ago
- Realtime GPU Profiler for AMD / NVIDIA / Intel GPUs ☆32 Updated last year
- Code for paper "Engineering a High-Performance GPU B-Tree" accepted to PPoPP 2019 ☆56 Updated 2 years ago
- Visual Computing Library ☆20 Updated 3 weeks ago
- CUDA Sparse-Matrix Vector Multiplication, using Sliced Coordinate format ☆21 Updated 6 years ago
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆35 Updated 9 years ago
- A fast and highly scalable GPU dynamic memory allocator ☆104 Updated 10 years ago
- RTX compute samples ☆70 Updated last year
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆105 Updated 7 years ago
- Multi-GPU Framework for Voxel Grid Computations ☆47 Updated last week
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 Updated 7 years ago
- Communication-Minimizing 2D Convolution in GPU Registers ☆30 Updated 11 years ago
- Compute morton keys using a look-up table generated at compile-time. ☆31 Updated 8 years ago
- LonestarGPU: Irregular algorithms parallelized for GPUs ☆34 Updated 5 years ago
- nimbus: a cloud computing framework for high performance computations ☆25 Updated 4 years ago