inducer / loopy
A code generator for array-based code on CPUs and GPUs
☆608 Updated last week
Alternatives and similar repositories for loopy
Users who are interested in loopy are comparing it to the libraries listed below.
- DaCe - Data Centric Parallel Programming ☆542 Updated this week
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆881 Updated last week
- common in-memory tensor structure ☆1,026 Updated last month
- The Foundation for All Legate Libraries ☆218 Updated this week
- The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs ☆1,306 Updated 2 months ago
- Kernel Tuner ☆351 Updated this week
- Python wrapper for isl, an integer set library ☆77 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆246 Updated this week
- ☆417 Updated last week
- Symbolic Expression and Statement Module for new DSLs ☆205 Updated 4 years ago
- CUSP : A C++ Templated Sparse Matrix Library ☆413 Updated last month
- The Legion Parallel Programming System ☆729 Updated last week
- Pluto: An automatic polyhedral parallelizer and locality optimizer ☆298 Updated 3 weeks ago
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆206 Updated 2 months ago
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆546 Updated 3 weeks ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆133 Updated last year
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆180 Updated 2 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆261 Updated 6 months ago
- Python bindings for UCX ☆137 Updated last week
- Python interface for MLIR - the Multi-Level Intermediate Representation ☆260 Updated 7 months ago
- Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX ☆220 Updated 4 years ago
- Python SYCL bindings and SYCL-based Python Array API library ☆114 Updated this week
- ☆241 Updated 2 weeks ago
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆413 Updated last week
- GPUOCelot: A dynamic compilation framework for PTX ☆288 Updated last year
- A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python ☆328 Updated 9 months ago
- Assembler for NVIDIA Volta and Turing GPUs ☆224 Updated 3 years ago
- CUDA Data Parallel Primitives Library ☆432 Updated 6 years ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆382 Updated this week
- Data Parallel Extension for NumPy ☆109 Updated this week