triton-lang / kernels
☆43Updated this week
Related projects
Alternatives and complementary repositories for kernels
- Extensible collectives library in Triton☆61Updated last month
- PyTorch bindings for CUTLASS grouped GEMM.☆51Updated last week
- Simple and fast low-bit matmul kernels in CUDA / Triton☆133Updated this week
- High-speed GEMV kernels, with up to 2.7x speedup compared to the PyTorch baseline.☆87Updated 3 months ago
- Applied AI experiments and examples for PyTorch☆159Updated last week
- ☆162Updated 3 months ago
- Collection of kernels written in the Triton language☆63Updated last week
- An experimental CPU backend for Triton☆55Updated last week
- ☆79Updated 2 months ago
- Efficient GPU support for LLM inference with x-bit quantization (e.g. FP6, FP5).☆196Updated last week
- Fast Hadamard transform in CUDA, with a PyTorch interface☆107Updated 5 months ago
- ☆88Updated 2 months ago
- ☆140Updated this week
- ☆55Updated 5 months ago
- Boosting 4-bit inference kernels with 2:4 Sparsity☆51Updated 2 months ago
- Cataloging released Triton kernels.☆132Updated 2 months ago
- Standalone Flash Attention v2 kernel without libtorch dependency☆98Updated last month
- Llama INT4 CUDA inference with AWQ☆47Updated 4 months ago
- TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.☆148Updated this week
- A Python library that transfers PyTorch tensors between CPU and NVMe☆96Updated this week
- ☆47Updated 2 weeks ago
- GPTQ inference TVM kernel☆35Updated 6 months ago
- ☆41Updated 4 months ago
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs☆183Updated last month
- ☆45Updated last month
- ☆130Updated 3 months ago
- Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity☆177Updated last year
- Materials for learning SGLang☆75Updated this week
- ☆121Updated this week
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.☆34Updated 2 years ago