Tutorials about polyhedral compilation.
☆61 · Feb 9, 2026 · Updated 2 months ago
Alternatives and similar repositories for handson-polyhedral
Users that are interested in handson-polyhedral are comparing it to the libraries listed below.
- ☆32 · Jul 2, 2025 · Updated 10 months ago
- TensaLang is a Tensor-first programming language, compiler, and runtime that lets you write the Model’s inference engine (e.g. LLMs) and s… ☆74 · Feb 20, 2026 · Updated 2 months ago
- Polyhedral Extraction Tool (source repository: http://repo.or.cz/w/pet.git) ☆41 · Jul 22, 2022 · Updated 3 years ago
- A retargetable and extensible synthesis-based compiler for modern hardware architectures ☆18 · Nov 20, 2025 · Updated 5 months ago
- a simple API to use CUPTI ☆10 · Aug 19, 2025 · Updated 8 months ago
- Hands-On Practical MLIR Tutorial ☆765 · Oct 20, 2023 · Updated 2 years ago
- Work related to vectorizing strategies for arbitrary FHE programs ☆10 · Sep 5, 2025 · Updated 8 months ago
- Triton Compiler related materials. ☆44 · Mar 16, 2026 · Updated last month
- Xilinx Modifications to Halide ☆13 · May 3, 2021 · Updated 5 years ago
- triton for dsa ☆63 · Apr 14, 2026 · Updated 3 weeks ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of … ☆31 · Dec 21, 2024 · Updated last year
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. ☆108 · Jun 28, 2025 · Updated 10 months ago
- GEMV implementation with CUTLASS ☆21 · Aug 21, 2025 · Updated 8 months ago
- A fast and accurate reuse distance analyzer for multi-threaded applications. It leverages existing hardware features in commodity CPUs. ☆21 · Feb 3, 2023 · Updated 3 years ago
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24) ☆57 · May 29, 2024 · Updated last year
- A Python compiler design toolkit. ☆523 · Updated this week
- Sample programs for the LLVM PTX back-end ☆41 · Aug 27, 2015 · Updated 10 years ago
- [NeurIPS 2025] ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive ☆68 · Dec 11, 2025 · Updated 4 months ago
- Pluto: An automatic polyhedral parallelizer and locality optimizer ☆330 · Mar 13, 2026 · Updated last month
- ☆49 · Jul 13, 2024 · Updated last year
- We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel … ☆193 · Jan 28, 2025 · Updated last year
- ☆13 · Mar 6, 2023 · Updated 3 years ago
- ☆40 · Apr 27, 2026 · Updated last week
- Search-based compiler for high-performance DSP programming ☆71 · Oct 29, 2024 · Updated last year
- MLIR For Beginners tutorial ☆1,286 · Jul 18, 2025 · Updated 9 months ago
- C/C++ frontend for MLIR. Also features polyhedral optimizations, parallel optimizations, and more! ☆612 · Jun 19, 2025 · Updated 10 months ago
- Polyhedral High-Level Synthesis in MLIR ☆35 · Mar 17, 2023 · Updated 3 years ago
- ☆44 · Oct 15, 2025 · Updated 6 months ago
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure. ☆19 · May 12, 2024 · Updated last year
- An easy-to-use tensor compiler for FHE ☆37 · Feb 18, 2025 · Updated last year
- Polyite: Iterative Schedule Optimization for Parallelization in the Polyhedron Model ☆12 · Jan 19, 2020 · Updated 6 years ago
- Automatic Generation of Benchmarks to Stress-Test Computing Systems. ☆43 · Updated this week
- The CLooG Code Generator in the Polyhedral Model ☆52 · Jun 26, 2023 · Updated 2 years ago
- Interprocedural Basic Block Code Layout Optimization ☆18 · Jan 17, 2019 · Updated 7 years ago
- ☆36 · Mar 7, 2025 · Updated last year
- ☆29 · Apr 7, 2025 · Updated last year
- Implementing SPMD control flow in LLVM using reconverging CFGs - Vectorizing Divergent Control-Flow for SIMD Applications ☆18 · Apr 11, 2019 · Updated 7 years ago
- PolyBench/C from http://web.cse.ohio-state.edu/~pouchet/software/polybench/ ☆19 · Jan 26, 2016 · Updated 10 years ago
- A minimal (really) out-of-tree MLIR example ☆47 · Aug 14, 2025 · Updated 8 months ago