vatai / tadashi
A library for code transformations with guaranteed legality
☆20 Updated last week
Alternatives and similar repositories for tadashi
Users who are interested in tadashi are comparing it to the libraries listed below.
- Custom-Precision Floating-point numbers. ☆41 Updated 3 weeks ago
- A lightweight, Pythonic, frontend for MLIR ☆80 Updated 2 years ago
- CUDA Dynamic Memory Allocator for SOA Data Layout ☆38 Updated 4 years ago
- ☆29 Updated last month
- ☆41 Updated 3 months ago
- development repository for the open earth compiler ☆81 Updated 4 years ago
- An HPL-AI implementation for Fugaku ☆23 Updated 4 years ago
- cuASR: CUDA Algebra for Semirings ☆43 Updated 3 years ago
- ☆11 Updated 4 years ago
- ☆20 Updated 6 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆32 Updated 9 months ago
- FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme ☆99 Updated last month
- Data-Centric MLIR dialect ☆44 Updated 2 years ago
- A unified framework across multiple programming platforms ☆42 Updated 7 months ago
- ☆29 Updated 6 years ago
- A Data-Centric Compiler for Machine Learning ☆85 Updated last month
- 🎃 GPU load-balancing library for regular and irregular computations. ☆64 Updated 4 months ago
- An MLIR-based source-to-source automatic differentiation system. ☆15 Updated 2 years ago
- Python wrapper for isl, an integer set library ☆82 Updated last week
- Next generation library for iterative sparse solvers for ROCm platform ☆94 Updated this week
- Graph-indexed Pandas DataFrames for analyzing hierarchical performance data ☆34 Updated 3 weeks ago
- A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code ☆15 Updated 2 years ago
- A task benchmark ☆44 Updated last year
- MLIR tools and dialect for GraphBLAS ☆18 Updated 3 years ago
- Arrow Matrix Decomposition - Communication-Efficient Distributed Sparse Matrix Multiplication ☆15 Updated last year
- CUDA Flux is a profiler for GPU applications which reports the basic block execution frequencies of compute kernels ☆32 Updated 4 years ago
- An MLIR frontend for tensor expressions ☆25 Updated 5 years ago
- SST Macro Element Library ☆36 Updated 2 months ago
- Data Dependence Analyzer in the Polyhedral Model ☆21 Updated 2 years ago
- BLAS implementation for Intel FPGA ☆78 Updated 5 years ago