triton-lang/Triton-to-tile-IR

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/triton-lang/Triton-to-tile-IR)

triton-lang / Triton-to-tile-IR

incubator repo for CUDA-TileIR backend

☆151

Alternatives and similar repositories for Triton-to-tile-IR

Users that are interested in Triton-to-tile-IR are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

serdes21 / flashtile
View on GitHub
FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.
☆61Feb 6, 2026Updated 5 months ago
NVIDIA / cuda-tile
View on GitHub
CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-base…
☆1,002Jul 22, 2026Updated last week
NVIDIA / TileGym
View on GitHub
Helpful kernel tutorials, examples and SKILLs for tile-based GPU programming
☆785Updated this week
cherichy / tilecute
View on GitHub
☆32Jul 2, 2025Updated last year
NVIDIA / CompileIQ
View on GitHub
An Optimizer for Nvidia Compilers.
☆113Jul 3, 2026Updated 3 weeks ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
triton-lang / triton-ext
View on GitHub
A collection of out-of-tree extensions for the Triton language and compiler
☆32Updated this week
YJMSTR / flash-linear-attention
View on GitHub
FLA but cuTile
☆27Apr 17, 2026Updated 3 months ago
facebookexperimental / triton
View on GitHub
Github mirror of trition-lang/triton repo.
☆181Updated this week
microsoft / triton-shared
View on GitHub
Shared Middle-Layer for Triton Compilation
☆340Dec 5, 2025Updated 7 months ago
NVIDIA / tilus
View on GitHub
Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.
☆490Jul 5, 2026Updated 3 weeks ago
pytorch / helion
View on GitHub
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
☆912Updated this week
NVIDIA / cutile-python
View on GitHub
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
☆2,123Updated this week
foundation-model-stack / vllm-triton-backend
View on GitHub
A Triton-only attention backend for vLLM
☆27Jul 14, 2026Updated 2 weeks ago
HydraQYH / hp_rms_norm
View on GitHub
High performance RMSNorm Implement by using SM Core Storage(Registers and Shared Memory)
☆30Jan 22, 2026Updated 6 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
toyaix / triton-runner
View on GitHub
Multi-Level Triton Runner supporting Python, IR, PTX, AMDGCN, cubin and hasco.
☆99May 8, 2026Updated 2 months ago
meta-pytorch / tritonbench
View on GitHub
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
☆363Updated this week
zhuzilin / flash-attention-with-sink
View on GitHub
☆37Aug 7, 2025Updated 11 months ago
NTT123 / cute-viz
View on GitHub
Cute layout visualization
☆44Jan 18, 2026Updated 6 months ago
ByteDance-Seed / Triton-distributed
View on GitHub
Distributed Compiler based on Triton for Parallel Systems
☆1,503Jul 20, 2026Updated last week
Mogball / triton_lite
View on GitHub
☆20May 24, 2025Updated last year
tile-ai / TileOPs
View on GitHub
High-performance LLM operator library built on TileLang.
☆165Updated this week
flashinfer-ai / cubloaty
View on GitHub
a size profiler for cuda binary
☆71Jan 15, 2026Updated 6 months ago
flagos-ai / libtriton_jit
View on GitHub
A Triton JIT runtime and ffi provider in C++
☆37Updated this week
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
Dao-AILab / quack
View on GitHub
A Quirky Assortment of CuTe Kernels
☆1,081Updated this week
tile-ai / tilescale
View on GitHub
Tile-based language built for AI computation across all scales
☆178Updated this week
microsoft / TileFusion
View on GitHub
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
☆115Jun 28, 2025Updated last year
dsl-learn / cutile-learn
View on GitHub
NVIDIA cuTile learn
☆169Dec 9, 2025Updated 7 months ago
flashinfer-ai / flashinfer-bench
View on GitHub
Building the Virtuous Cycle for AI-driven LLM Systems
☆261May 1, 2026Updated 3 months ago
toyaix / tritonllm
View on GitHub
LLM Inference via Triton (Flexible & Modular): Focused on Kernel Optimization using CUBIN binaries, Starting from gpt-oss Model
☆119Apr 28, 2026Updated 3 months ago
meta-pytorch / tritonparse
View on GitHub
TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels
☆211Jul 15, 2026Updated 2 weeks ago
ColfaxResearch / layout-categories
View on GitHub
This repository contains companion software for the Colfax Research paper "Categorical Foundations for CuTe Layouts".
☆140Sep 24, 2025Updated 10 months ago
DeepLink-org / DLCompiler
View on GitHub
triton for dsa
☆68Updated this week
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
Deep-Learning-Profiling-Tools / triton-viz
View on GitHub
☆351Jul 16, 2026Updated 2 weeks ago
Dao-AILab / sonic-moe
View on GitHub
Accelerating MoE with IO and Tile-aware Optimizations
☆734Jul 4, 2026Updated 3 weeks ago
monellz / FlashTensor
View on GitHub
☆19Mar 4, 2025Updated last year
flagos-ai / FlagTree
View on GitHub
FlagTree is a unified compiler supporting multiple AI chip backends for custom Deep Learning operations, which is forked from triton-lang…
☆305Updated this week
meta-pytorch / MSLK
View on GitHub
MSLK (Meta Superintelligence Labs Kernels) is a collection of PyTorch GPU operator libraries that are designed and optimized for GenAI tr…
☆123Updated this week
IBM / triton-dejavu
View on GitHub
Framework to reduce autotune overhead to zero for well known deployments.
☆102Sep 19, 2025Updated 10 months ago
apache / tvm-ffi
View on GitHub
Open ABI and FFI for Machine Learning Systems
☆439Updated this week