inducer / loopy
A code generator for array-based code on CPUs and GPUs
☆602 (updated last week)
Alternatives and similar repositories for loopy:
Users who are interested in loopy compare it to the libraries listed below.
- The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs. ☆1,293 (updated 3 weeks ago)
- Stretching GPU performance for GEMMs and tensor contractions. ☆237 (updated 2 weeks ago)
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆869 (updated last week)
- common in-memory tensor structure. ☆983 (updated 3 weeks ago)
- CUSP: A C++ Templated Sparse Matrix Library. ☆412 (updated 6 months ago)
- Kernel Tuner. ☆331 (updated this week)
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆533 (updated last month)
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm