nox-410/Welder

OSDI 2023 Welder, a deep learning compiler

☆34

Alternatives and similar repositories for Welder

Users that are interested in Welder are comparing it to the libraries listed below.

nox-410 / tvm.tl
An extension of TVMScript to write simple and high-performance GPU kernels with tensorcore.
☆52 · Jul 23, 2024 · Updated 2 years ago
tsinghua-ideal / Syno
Source code repository for ASPLOS '25 paper "Syno: Structured Synthesis for Neural Operators"
☆15 · Aug 31, 2025 · Updated 10 months ago
microsoft / TileFusion
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
☆115 · Jun 28, 2025 · Updated last year
AlibabaResearch / mononn
☆32 · Jul 17, 2024 · Updated 2 years ago
xinhaoc / ferret
Autonomous CUDA kernel optimization agent with structured task specs and per-config scoring
☆17 · Jun 17, 2026 · Updated last month
summerspringwei / souffle-ae
☆17 · Jan 24, 2024 · Updated 2 years ago
tile-ai / TileOPs
High-performance LLM operator library built on TileLang.
☆164 · Updated this week
monellz / FlashTensor
☆19 · Mar 4, 2025 · Updated last year
KuangjuX / Paper-reading
My Paper Reading Lists and Notes.
☆25 · May 8, 2026 · Updated 2 months ago
tile-ai / tilescale
Tile-based language built for AI computation across all scales
☆176 · Updated this week
alibaba / redfuser
☆21 · Mar 17, 2026 · Updated 4 months ago
uwsampl / SparseTIR
SparseTIR: Sparse Tensor Compiler for Deep Learning
☆145 · Mar 31, 2023 · Updated 3 years ago
pku-liang / TileFlow
TileFlow is a performance analysis tool based on Timeloop for fusion dataflows
☆72 · Apr 12, 2024 · Updated 2 years ago
SJTU-ReArch-Group / Paper-Reading-List
☆154 · Updated this week
tile-ai / tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
☆20 · Updated this week
tile-ai / TileFoundry
☆55 · Updated this week
eniac / paella
Paella: Low-latency Model Serving with Virtualized GPU Scheduling
☆72 · May 1, 2024 · Updated 2 years ago
TiledTensor / TiledLower
TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.
☆13 · Nov 23, 2024 · Updated last year
HuangShiqing / memory_viz_plus
☆18 · Jun 14, 2025 · Updated last year
microsoft / nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
☆1,002 · Sep 19, 2024 · Updated last year
hgl71964 / cuasmrl
☆19 · Nov 9, 2024 · Updated last year
toyaix / triton-runner
Multi-Level Triton Runner supporting Python, IR, PTX, AMDGCN, cubin and hasco.
☆98 · May 8, 2026 · Updated 2 months ago
triton-lang / Triton-to-tile-IR
Incubator repo for CUDA-TileIR backend
☆149 · Jul 10, 2026 · Updated 2 weeks ago
zhaiyi000 / tlm
☆49 · Jul 13, 2024 · Updated 2 years ago
HydraQYH / hp_rms_norm
High-performance RMSNorm implementation using SM core storage (registers and shared memory)
☆30 · Jan 22, 2026 · Updated 6 months ago
hao-ai-lab / cse234-w25
Website for CSE 234, Winter 2025
☆16 · Mar 24, 2025 · Updated last year
humuyan / Korch
ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch
☆41 · Mar 27, 2025 · Updated last year
FCAS-LAB / LEGOSIM_MICRO
☆29 · Aug 4, 2025 · Updated 11 months ago
kaist-ina / Trinity-AE
Source code for Trinity (ASPLOS 2026)
☆26 · Apr 24, 2026 · Updated 3 months ago
ChandlerGuan / kperfir_artifact
☆19 · May 9, 2025 · Updated last year
TiledTensor / TiledKernel
TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.
☆19 · May 12, 2024 · Updated 2 years ago
buddy-compiler / buddy-mlir
An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).
☆745 · Updated this week
KuangjuX / TileGraph
TileGraph is an experimental DNN compiler that utilizes static code generation and kernel fusion techniques.
☆11 · Sep 18, 2024 · Updated last year
lucifer1004 / VeloQ
Agent-friendly GPU profile-query CLI
☆106 · Jun 22, 2026 · Updated last month
tile-ai / tilelang-benchmark
☆22 · Jun 10, 2026 · Updated last month
uiuc-arc / neptune
☆28 · Jun 18, 2026 · Updated last month
YangLinzhuo / cuda-sgemm-optimization
CUDA SGEMM optimization note
☆15 · Oct 31, 2023 · Updated 2 years ago
nulidangxueshen / CSR2
A New Format for SIMD-accelerated SpMV
☆22 · Apr 4, 2022 · Updated 4 years ago
facebookexperimental / triton
GitHub mirror of triton-lang/triton repo.
☆181 · Updated this week