OSDI 2023 Welder, deep learning compiler
☆33Nov 24, 2023Updated 2 years ago
Alternatives and similar repositories for Welder
Users who are interested in Welder are comparing it to the libraries listed below.
- An extension of TVMScript to write simple and high-performance GPU kernels with tensorcore.☆52Jul 23, 2024Updated last year
- ☆17Jan 24, 2024Updated 2 years ago
- My Paper Reading Lists and Notes.☆22Mar 28, 2026Updated last month
- SparseTIR: Sparse Tensor Compiler for Deep Learning☆143Mar 31, 2023Updated 3 years ago
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows☆66Apr 12, 2024Updated 2 years ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆108Jun 28, 2025Updated 10 months ago
- Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators☆19Apr 24, 2026Updated last week
- ☆19Mar 4, 2025Updated last year
- ☆32Jul 17, 2024Updated last year
- ☆16Nov 9, 2024Updated last year
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling☆71May 1, 2024Updated 2 years ago
- ☆49Jul 13, 2024Updated last year
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch☆41Mar 27, 2025Updated last year
- TileGraph is an experimental DNN compiler that utilizes static code generation and kernel fusion techniques.☆11Sep 18, 2024Updated last year
- A New Format for SIMD-accelerated SpMV☆22Apr 4, 2022Updated 4 years ago
- CUDA SGEMM optimization note☆15Oct 31, 2023Updated 2 years ago
- Fibertree emulator☆17Nov 4, 2024Updated last year
- Tile-based language built for AI computation across all scales☆143Mar 27, 2026Updated last month
- ☆23Jun 11, 2025Updated 10 months ago
- Framework to reduce autotune overhead to zero for well known deployments.☆99Sep 19, 2025Updated 7 months ago
- ☆150Apr 2, 2026Updated last month
- gem5-X open source project☆18Mar 28, 2023Updated 3 years ago
- pLUTo is a DRAM-based Processing-using-Memory architecture that leverages the high density of DRAM to enable the massively parallel stori…☆18Jan 12, 2023Updated 3 years ago
- Multi-Level Triton Runner supporting Python, IR, PTX, AMDGCN, cubin and hasco.☆96Apr 29, 2026Updated last week
- High-performance LLM operator library built on TileLang.☆118Updated this week
- ☆15Nov 12, 2023Updated 2 years ago
- An analytical framework that models hardware dataflow of tensor applications on spatial architectures using the relation-centric notation…☆88Apr 28, 2024Updated 2 years ago
- ☆14Oct 8, 2024Updated last year
- We put all ready-to-go models here☆20Dec 30, 2022Updated 3 years ago
- InfiniTensor is a high-performance inference engine tailored for GPUs and AI accelerators. Its design focuses on effective deployment and…☆310Apr 30, 2026Updated last week
- An open-source efficient deep learning framework/compiler, written in Python.☆741Sep 4, 2025Updated 8 months ago
- Compiler for Dynamic Neural Networks☆45Nov 13, 2023Updated 2 years ago
- An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).☆718Apr 30, 2026Updated last week
- LCAI-TIHU SW is a software stack of the AI inference processor based on RISC-V☆23Dec 14, 2022Updated 3 years ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches☆15Jun 21, 2019Updated 6 years ago
- A tool for synthesis of Rust code, very early prototype☆13Jan 9, 2024Updated 2 years ago
- A deep learning intermediate representation for multi-platform compilation optimization☆10Oct 28, 2024Updated last year
- Development repository for the Triton-Linalg conversion☆218Feb 7, 2025Updated last year
- ☆11Jun 14, 2024Updated last year