dsharlet / slinky
Optimize pipelines for locality
☆14 Updated 2 weeks ago
Alternatives and similar repositories for slinky
Users who are interested in slinky compare it to the libraries listed below.
- A compiler for rewriting image processing functions in C++ to Halide ☆24 Updated 3 years ago
- CUDA matrix computation library specialized for small matrix operations (3x3, 4x4, 1x3, 1x4, etc.). Including buffer ☆18 Updated last year
- Reference implementation of the draft C++ GraphBLAS specification. ☆32 Updated 11 months ago
- ☆14 Updated 3 years ago
- A simple, but fast, triangular solver ☆18 Updated 4 years ago
- Program Generator for Small-Scale Linear Algebra Applications ☆32 Updated 7 years ago
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. ☆37 Updated last week
- Resources for the SIAMCSE21 minitutorial "Automatic Differentiation as a Tool for Computational Science" ☆14 Updated 4 years ago
- A unified framework across multiple programming platforms ☆43 Updated this week
- Programmable JIT Compilation and Optimization for C/C++ using LLVM ☆41 Updated this week
- WIP · CUDA compatibility for Blaze · https://bitbucket.org/blaze-lib/blaze ☆21 Updated 6 years ago
- Range-based for loops to iterate over a range of numbers or values ☆34 Updated 9 years ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆48 Updated 11 years ago
- Counter-based random number generators for C, C++ and CUDA. ☆115 Updated last year
- High-level C++ for Accelerator Clusters ☆154 Updated 2 months ago
- Cooperative Primitives for CUDA C++ Kernel Authors. This repository contains CUB PRs from Q4 2019 until Q4 2020. ☆22 Updated 5 years ago
- ☆24 Updated 2 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- A Valgrind tool for Herbie ☆97 Updated 3 years ago
- data-parallel out-of-core library ☆50 Updated 2 months ago
- IPython / Jupyter integration for pybind11 ☆68 Updated 8 years ago
- Sample code for our CUDA AMR Iso-Surface Extraction ☆14 Updated 5 years ago
- FMM Template Library ☆45 Updated 7 years ago
- C++ library for graph ordering ☆15 Updated 5 years ago
- Full-speed Array of Structures access ☆176 Updated 2 years ago
- A project to quickly detect discrepancies in floating point computation across hardware, compilers, libraries and software. ☆39 Updated last year
- Use CUDA intrinsics with user-defined types ☆48 Updated 11 years ago
- compilable markdown for linear algebra ☆228 Updated 2 years ago
- Lock-free parallel disjoint set data structure (aka UNION-FIND) with path compression and union by rank ☆67 Updated 10 years ago
- ☆16 Updated 3 years ago