inducer / loopy
A code generator for array-based code on CPUs and GPUs
☆616 Updated last week
Alternatives and similar repositories for loopy
Users who are interested in loopy compare it to the libraries listed below.
- The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs ☆1,331 Updated 7 months ago
- common in-memory tensor structure ☆1,098 Updated last month
- DaCe - Data Centric Parallel Programming ☆559 Updated this week
- The Foundation for All Legate Libraries ☆232 Updated this week
- ☆422 Updated this week
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆918 Updated last month
- Kernel Tuner ☆372 Updated last week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆254 Updated last week
- The Legion Parallel Programming System ☆747 Updated last month
- Python wrapper for isl, an integer set library ☆80 Updated 2 weeks ago
- Pluto: An automatic polyhedral parallelizer and locality optimizer ☆308 Updated 2 months ago
- POC work on MLIR backend ☆61 Updated last year
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆211 Updated 2 weeks ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆135 Updated 2 years ago
- CUSP : A C++ Templated Sparse Matrix Library ☆417 Updated 3 months ago
- Python SYCL bindings and SYCL-based Python Array API library ☆117 Updated this week
- Symbolic Expression and Statement Module for new DSLs ☆205 Updated 5 years ago
- Python interface for MLIR - the Multi-Level Intermediate Representation ☆270 Updated 11 months ago
- A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python ☆333 Updated last year
- A Deep Learning Meta-Framework and HPC Benchmarking Library ☆81 Updated 3 years ago
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆182 Updated 2 years ago
- A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP) ☆427 Updated 10 months ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆388 Updated last week
- Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX ☆222 Updated 5 years ago
- Data Parallel Extension for NumPy ☆118 Updated this week
- ☆247 Updated 3 months ago
- STREAM, for lots of devices written in many programming models ☆351 Updated 2 months ago
- Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git) ☆131 Updated 3 years ago
- GPUOCelot: A dynamic compilation framework for PTX ☆288 Updated 2 years ago
- The Tensor Algebra SuperOptimizer for Deep Learning ☆730 Updated 2 years ago