dsharlet / slinky
Optimize pipelines for locality
☆14Updated last month
Alternatives and similar repositories for slinky
Users who are interested in slinky are comparing it to the libraries listed below.
- A compiler for rewriting image processing functions in C++ to Halide☆24Updated 2 years ago
- A simple, but fast, triangular solver☆18Updated 4 years ago
- CUDA matrix computation library specialized for small matrix operations (3x3, 4x4, 1x3, 1x4, etc.). Including buffer☆18Updated last year
- Reference implementation of the draft C++ GraphBLAS specification.☆32Updated 11 months ago
- Program Generator for Small-Scale Linear Algebra Applications☆31Updated 7 years ago
- SPERR is a lossy scientific (floating-point) data compressor that produces one of the best rate-distortion curves.☆25Updated 3 weeks ago
- FMM Template Library☆45Updated 7 years ago
- A header-only compile-time Morton encoding / decoding library for N dimensions.☆114Updated 2 years ago
- Automatic Differentiation for high-performance stencil loops☆13Updated 4 years ago
- Atomistic Spin Simulation Framework☆66Updated 5 years ago
- WIP · CUDA compatibility for Blaze · https://bitbucket.org/blaze-lib/blaze☆21Updated 6 years ago
- Sample code for our CUDA AMR Iso-Surface Extraction☆14Updated 5 years ago
- In-place Parallel Super Scalar Samplesort (IPS⁴o)☆132Updated last year
- Resources for the SIAMCSE21 minitutorial "Automatic Differentiation as a Tool for Computational Science"☆14Updated 4 years ago
- High-level C++ for Accelerator Clusters☆155Updated 2 months ago
- Generate simple index ranges in C++ and CUDA C++☆39Updated 2 years ago
- A Valgrind tool for Herbie☆97Updated 3 years ago
- Multi-dimensional C++ arrays which store objects in a Struct-of-Arrays (SoA) memory layout for efficient vectorization and zero address g…☆36Updated 5 years ago
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line☆24Updated 2 months ago
- Variant type for CUDA☆12Updated 10 years ago
- A nanobind example project☆119Updated last month
- Range-based for loops to iterate over a range of numbers or values☆34Updated 9 years ago
- C++ multidimensional arrays in the spirit of the STL☆203Updated 8 months ago
- An alternative to Boost.MPI for a user friendly C++ interface for MPI (MPICH).☆19Updated 7 years ago
- Lock-free parallel disjoint set data structure (aka UNION-FIND) with path compression and union by rank☆67Updated 10 years ago
- Skeletonide is a parallel implementation of the Zhang-Suen morphological thinning algorithm written in Halide. Use it for fast skeletoni…☆14Updated 5 years ago
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development.☆37Updated 2 months ago