nox-410 / Welder
OSDI 2023 Welder, deep learning compiler
☆32Nov 24, 2023Updated 2 years ago
Alternatives and similar repositories for Welder
Users who are interested in Welder are comparing it to the libraries listed below.
- An extension of TVMScript to write simple and high-performance GPU kernels with tensorcore.☆51Jul 23, 2024Updated last year
- ☆17Jan 24, 2024Updated 2 years ago
- My Paper Reading Lists and Notes.☆21Nov 26, 2025Updated 2 months ago
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows☆66Apr 12, 2024Updated last year
- SparseTIR: Sparse Tensor Compiler for Deep Learning☆142Mar 31, 2023Updated 2 years ago
- ☆18Mar 4, 2025Updated 11 months ago
- TileGraph is an experimental DNN compiler that utilizes static code generation and kernel fusion techniques.☆12Sep 18, 2024Updated last year
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆106Jun 28, 2025Updated 7 months ago
- ☆14Nov 9, 2024Updated last year
- ☆32Jul 17, 2024Updated last year
- CUDA SGEMM optimization note☆15Oct 31, 2023Updated 2 years ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators☆19Updated this week
- ☆48Jul 13, 2024Updated last year
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling☆68May 1, 2024Updated last year
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch☆39Mar 27, 2025Updated 10 months ago
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.☆19May 12, 2024Updated last year
- ☆23Jun 11, 2025Updated 8 months ago
- A New Format for SIMD-accelerated SpMV☆22Apr 4, 2022Updated 3 years ago
- Framework to reduce autotune overhead to zero for well known deployments.☆96Sep 19, 2025Updated 4 months ago
- Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.☆84Jan 27, 2026Updated 2 weeks ago
- Tile-based language built for AI computation across all scales☆120Feb 8, 2026Updated last week
- ☆25Feb 20, 2024Updated last year
- ☆288Feb 4, 2026Updated last week
- Shared Middle-Layer for Triton Compilation☆326Dec 5, 2025Updated 2 months ago
- ☆145Dec 19, 2025Updated last month
- A layered, decoupled deep learning inference engine☆79Feb 17, 2025Updated 11 months ago
- An open-source efficient deep learning framework/compiler, written in Python.☆739Sep 4, 2025Updated 5 months ago
- HeteroCL-MLIR dialect for accelerator design☆42Sep 18, 2024Updated last year
- Tensor Contraction Code Generator☆39Aug 14, 2017Updated 8 years ago
- A domain-specific language (DSL) based on Triton but providing higher-level abstractions.☆41Feb 4, 2026Updated last week
- An analytical framework that models hardware dataflow of tensor applications on spatial architectures using the relation-centric notation…☆87Apr 28, 2024Updated last year
- A language and compiler for irregular tensor programs.☆151Nov 29, 2024Updated last year
- ☆40Feb 28, 2020Updated 5 years ago
- lab solutions of ICS course☆10Jan 20, 2013Updated 13 years ago
- A deep learning intermediate representation for multi-platform compilation optimization☆10Oct 28, 2024Updated last year
- ☆20May 24, 2025Updated 8 months ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition☆48Feb 10, 2015Updated 11 years ago
- Automated bottleneck detection and solution orchestration☆19Feb 3, 2026Updated last week
- ☆12Jan 23, 2026Updated 3 weeks ago