tile-ai/tilelang

tile-ai / tilelang

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

☆6,871

Alternatives and similar repositories for tilelang

Users who are interested in tilelang are comparing it to the libraries listed below.

flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆6,018 · Updated this week
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,498 · Updated this week
deepseek-ai / TileKernels
A kernel library written in tilelang
☆1,661 · Apr 23, 2026 · Updated 3 months ago
HazyResearch / ThunderKittens
Tile primitives for speedy kernels
☆3,563 · Jul 13, 2026 · Updated last week
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,123 · Updated this week
deepseek-ai / DeepGEMM
DeepGEMM: clean and efficient BLAS kernel library on GPU
☆7,554 · Updated this week
fla-org / flash-linear-attention
🚀 Efficient implementations for emerging model architectures
☆5,409 · Updated this week
mirage-project / mirage
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
☆2,390 · Updated this week
tile-ai / TileRT
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
☆1,589 · Jul 14, 2026 · Updated last week
Dao-AILab / quack
A Quirky Assortment of CuTe Kernels
☆1,070 · Updated this week
kvcache-ai / Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆5,986 · Updated this week
xlite-dev / LeetCUDA
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
☆11,621 · Updated this week
NVIDIA / cutile-python
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
☆2,121 · Updated this week
triton-lang / triton
Development repository for the Triton language and compiler
☆19,778 · Updated this week
sgl-project / sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
☆30,706 · Updated this week
Dao-AILab / sonic-moe
Accelerating MoE with IO and Tile-aware Optimizations
☆732 · Jul 4, 2026 · Updated 3 weeks ago
tile-ai / tilelang-puzzles
Learning TileLang with 10 puzzles!
☆348 · May 28, 2026 · Updated last month
bytedance / flux
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
☆1,345 · Aug 28, 2025 · Updated 10 months ago
tile-ai / tilelang-ascend
Ascend TileLang adapter
☆338 · Updated this week
Tencent / hpc-ops
High Performance LLM Inference Operator Library
☆1,063 · Updated this week
sgl-project / mini-sglang
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
☆4,628 · May 17, 2026 · Updated 2 months ago
GeeeekExplorer / nano-vllm
Nano vLLM
☆14,627 · Apr 26, 2026 · Updated 2 months ago
zhaochenyang20 / Awesome-ML-SYS-Tutorial
My learning notes for ML SYS.
☆6,768 · Updated this week
pytorch / helion
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
☆914 · Updated this week
flagos-ai / FlagGems
FlagGems is an operator library for large language models implemented in the Triton Language.
☆1,057 · Updated this week
HazyResearch / Megakernels
Kernels, of the mega variety :)
☆786 · May 26, 2026 · Updated last month
apache / tvm-ffi
Open ABI and FFI for Machine Learning Systems
☆435 · Updated this week
NVIDIA / TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H…
☆3,446 · Updated this week
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆24,526 · Updated this week
deepseek-ai / DeepEP
DeepEP: an efficient expert-parallel communication library
☆9,885 · Jul 14, 2026 · Updated last week
tile-ai / tilescale
Tile-based language built for AI computation across all scales
☆176 · Jun 16, 2026 · Updated last month
THUDM / slime
slime is an LLM post-training framework for RL Scaling.
☆7,621 · Updated this week
microsoft / BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
☆769 · Aug 6, 2025 · Updated 11 months ago
deepseek-ai / FlashMLA
FlashMLA: Efficient Multi-head Latent Attention Kernels
☆12,768 · Apr 30, 2026 · Updated 2 months ago
radixark / miles
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
☆1,784 · Updated this week
tile-ai / TileOPs
High-performance LLM operator library built on TileLang.
☆162 · Updated this week
gpu-mode / Triton-Puzzles
Puzzles for learning Triton
☆2,537 · Apr 1, 2026 · Updated 3 months ago
verl-project / verl
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,649 · Updated this week
ai-dynamo / dynamo
A Datacenter Scale Distributed Inference Serving Framework
☆7,574 · Updated this week