TiledTensor / TiledKernel
TiledKernel is a code generation library based on macro kernels and a memory hierarchy graph data structure.
☆18 · Updated 4 months ago
Related projects:
- Triton to TVM transpiler. ☆15 · Updated last week
- PTX-EMU is a simple emulator for CUDA programs. ☆21 · Updated 8 months ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo. ☆18 · Updated last year
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24). ☆35 · Updated 3 months ago
- A GPU FP32 computation method with Tensor Cores. ☆18 · Updated last year
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications. ☆20 · Updated 3 months ago
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows. ☆53 · Updated 5 months ago
- An extension of TVMScript for writing simple, high-performance GPU kernels with Tensor Cores. ☆49 · Updated last month
- An MLIR-based toy DL compiler for TVM Relay. ☆53 · Updated last year
- TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles. ☆114 · Updated last week
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS. ☆17 · Updated 2 years ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches. ☆13 · Updated 5 years ago
- HeteroCL-MLIR dialect for accelerator design. ☆38 · Updated 3 months ago
- Welder (OSDI 2023), a deep learning compiler. ☆14 · Updated 9 months ago
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators. ☆100 · Updated last year
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch. ☆24 · Updated last month
- SpV8 is a SpMV kernel written in AVX-512. Artifact for our SpV8 paper @ DAC '21. ☆25 · Updated 3 years ago
- A translator from NVPTX to SPIR-V, modified from the LLVM-SPIR-V Translator. ☆33 · Updated 2 years ago
- Agile hardware-software co-design. ☆42 · Updated 2 years ago
- Data-Centric MLIR dialect. ☆37 · Updated 11 months ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs. ☆27 · Updated last year
- A language and compiler for irregular tensor programs. ☆132 · Updated 4 months ago