Starmys/TritonStudyGroup


☆133

Alternatives and similar repositories for TritonStudyGroup

Users who are interested in TritonStudyGroup are comparing it to the libraries listed below.

vllm-project / tml-fa4
FA4-based Relative Attention Kernel developed by TML and Colfax
☆17 · Jul 17, 2026 · Updated last week
xlite-dev / LeetCUDA
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
☆11,655 · Updated this week
toyaix / tritonllm
LLM Inference via Triton (Flexible & Modular): Focused on Kernel Optimization using CUBIN binaries, Starting from gpt-oss Model
☆119 · Apr 28, 2026 · Updated 3 months ago
zjhellofss / kuiperbook
☆17 · Apr 23, 2026 · Updated 3 months ago
SiriusNEO / Triton-Puzzles-Lite
Puzzles for learning Triton, play it with minimal environment configuration!
☆739 · Mar 17, 2026 · Updated 4 months ago
HarryWu99 / funny_cute
Some funny cute/cuteDSL code snippets
☆33 · Mar 2, 2026 · Updated 4 months ago
amy-77 / ParisKV
🔥 [ICML'26] ParisKV: Fast and Drift-Robust KV-Cache Retrieval for Long-Context LLMs
☆30 · Jun 29, 2026 · Updated last month
Harry-Chen / fp4_sm120
Make FP4 on 5090 Great Again
☆17 · Jul 20, 2026 · Updated last week
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆6,053 · Updated this week
hyperai / triton-cn
Triton Documentation in Simplified Chinese / Triton 中文文档
☆118 · Mar 5, 2026 · Updated 4 months ago
thu-ml / SLA
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
☆324 · Feb 24, 2026 · Updated 5 months ago
ArthurinRUC / cutlass-notes
From Minimal GEMM to Everything
☆230 · Jul 9, 2026 · Updated 2 weeks ago
icerain-alt / FSDPToys
Learning and Debugging for FSDP/FSDP2 Training
☆17 · Feb 7, 2026 · Updated 5 months ago
tile-ai / tilelang
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
☆7,007 · Updated this week
hao-ai-lab / Awesome-Video-Attention
A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach…
☆61 · Oct 27, 2025 · Updated 9 months ago
tsinghua-ideal / Twilight
[NeurIPS'25 Spotlight] Adaptive Attention Sparsity with Hierarchical Top-p Pruning
☆105 · Jul 8, 2026 · Updated 3 weeks ago
fla-org / flash-linear-attention
🚀 Efficient implementations for emerging model architectures
☆5,463 · Updated this week
AkideLiu / MiniCache
☆14 · Sep 7, 2024 · Updated last year
PiotrNawrot / nano-sparse-attention
The simplest implementation of recent Sparse Attention patterns for efficient LLM inference.
☆92 · Jul 17, 2025 · Updated last year
QwenLM / FlashQLA
High-performance linear attention kernel library built on TileLang
☆616 · Updated this week
Allen-C-Guan / Pytorch-Inductor-Tutorial
☆102 · Jun 26, 2026 · Updated last month
mlc-ai / pith-train
Compact and Agent-Native MoE Training System
☆304 · Updated this week
mit-han-lab / ncu-report-skill
☆159 · May 24, 2026 · Updated 2 months ago
mit-han-lab / Block-Sparse-Attention
A sparse attention kernel supporting mix sparse patterns
☆539 · Jan 18, 2026 · Updated 6 months ago
vipshop / cache-dit
A PyTorch-native inference engine with cache, parallelism, quantization and cpu offload for DiTs.
☆1,239 · Updated this week
TiledTensor / TiledLower
TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.
☆13 · Nov 23, 2024 · Updated last year
CURRENTF / Sparse-vLLM
A sparse-first inference engine (sparsevllm). It also contains DeltaKV compressor training + evaluation tooling (deltakv).
☆63 · Updated this week
Dao-AILab / quack
A Quirky Assortment of CuTe Kernels
☆1,076 · Updated this week
uiuc-arc / felix
Optimize tensor program fast with Felix, a gradient descent autotuner.
☆33 · Mar 5, 2026 · Updated 4 months ago
cyhdmjzzy / DeepEP-Code-Analysis
☆26 · Feb 27, 2026 · Updated 5 months ago
XunhaoLai / native-sparse-attention-triton
Efficient triton implementation of Native Sparse Attention.
☆284 · May 23, 2025 · Updated last year
GeeeekExplorer / nano-vllm
Nano vLLM
☆14,679 · Apr 26, 2026 · Updated 3 months ago
CalvinXKY / BasicCUDA
A tutorial for CUDA & PyTorch
☆481 · Mar 23, 2026 · Updated 4 months ago
dsl-learn / triton-tutorial
Getting Started with Triton: A Tutorial for Python Beginners
☆61 · Mar 26, 2026 · Updated 4 months ago
CalvinXKY / InfraTech
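Sharing AI infra knowledge and code exercises: PyTorch, vLLM/SGLang, getting started with the slime/vime frameworks ⚡️, performance acceleration 🚀, LLM fundamentals 🧠, AI software and hardware 🔧, and more
☆3,193 · Updated this week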
Dao-AILab / sonic-moe
Accelerating MoE with IO and Tile-aware Optimizations
☆732 · Jul 4, 2026 · Updated 3 weeks ago
GeeeekExplorer / kkbot
A Feishu/Lark AI agent bot
☆15 · Feb 27, 2026 · Updated 5 months ago
Infini-AI-Lab / S2FT
☆19 · Jan 3, 2025 · Updated last year
mit-han-lab / flash-moba
☆251 · Nov 19, 2025 · Updated 8 months ago