Benchmark PyTorch Custom Operators
☆14 · Jul 6, 2023 · Updated 2 years ago
Alternatives and similar repositories for epoi
Users who are interested in epoi compare it to the libraries listed below.
- ☆145 · Jan 30, 2025 · Updated last year
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo · ☆17 · Mar 13, 2023 · Updated 2 years ago
- ☆23 · Aug 21, 2025 · Updated 6 months ago
- A schedule language for large model training · ☆152 · Aug 21, 2025 · Updated 6 months ago
- DietCode Code Release · ☆65 · Jul 21, 2022 · Updated 3 years ago
- HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration · ☆15 · Sep 14, 2020 · Updated 5 years ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning · ☆143 · Mar 31, 2023 · Updated 2 years ago
- HeteroCL-MLIR dialect for accelerator design · ☆42 · Sep 18, 2024 · Updated last year
- DOSA: Differentiable Model-Based One-Loop Search for DNN Accelerators · ☆19 · Oct 10, 2024 · Updated last year
- A language and compiler for irregular tensor programs. · ☆152 · Nov 29, 2024 · Updated last year
- ☆14 · Nov 7, 2025 · Updated 3 months ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions · ☆21 · Apr 15, 2022 · Updated 3 years ago
- ☆15 · Oct 26, 2022 · Updated 3 years ago
- ☆42 · Sep 8, 2023 · Updated 2 years ago
- An experimental ahead of time compiler for Relay. · ☆49 · Apr 21, 2020 · Updated 5 years ago
- ☆21 · Dec 27, 2019 · Updated 6 years ago
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning · ☆25 · May 12, 2025 · Updated 9 months ago
- Polyhedral High-Level Synthesis in MLIR · ☆35 · Mar 17, 2023 · Updated 2 years ago
- ☆24 · Feb 20, 2024 · Updated 2 years ago
- Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs · ☆27 · Jun 25, 2024 · Updated last year
- Optimize tensor program fast with Felix, a gradient descent autotuner. · ☆32 · Apr 27, 2024 · Updated last year
- The Next-gen Language & Compiler Powering Efficient Hardware Design · ☆36 · Jan 16, 2025 · Updated last year
- A self-contained version of the tutorial which can be easily cloned and viewed by others. · ☆24 · Jun 24, 2019 · Updated 6 years ago
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators · ☆121 · Oct 26, 2022 · Updated 3 years ago
- TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels · ☆195 · Updated this week
- A collection of research papers on efficient training of DNNs · ☆69 · Jul 6, 2022 · Updated 3 years ago
- DASS HLS Compiler · ☆29 · Oct 4, 2023 · Updated 2 years ago
- ☆72 · Feb 16, 2023 · Updated 3 years ago
- An awesome curated list of languages and tools to program FPGAs · ☆73 · Jun 22, 2022 · Updated 3 years ago
- Tzer: TVM Implementation of "Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation" (OOPSLA'22). · ☆71 · Mar 9, 2023 · Updated 2 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections · ☆125 · Jun 23, 2022 · Updated 3 years ago
- An OpenCL-based FPGA benchmark suite for HPC · ☆37 · Jan 29, 2026 · Updated last month
- Re-implementation of the TASO compiler using equality saturation · ☆138 · Jun 28, 2021 · Updated 4 years ago
- ☆192 · Mar 28, 2023 · Updated 2 years ago
- Prefix-Aware Attention for LLM Decoding · ☆29 · Jan 23, 2026 · Updated last month
- ☆82 · Dec 1, 2023 · Updated 2 years ago
- A Generic Distributed Auto-Tuning Infrastructure · ☆24 · Jul 29, 2021 · Updated 4 years ago
- ☆250 · Jul 27, 2025 · Updated 7 months ago
- ☆41 · Jun 5, 2024 · Updated last year