UmerHA/triton_util

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/UmerHA/triton_util)

UmerHA / triton_util

Make triton easier

☆49

Alternatives and similar repositories for triton_util

Users that are interested in triton_util are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

cchan / tccl
View on GitHub
extensible collectives library in triton
☆97Mar 31, 2025Updated last year
meta-pytorch / FACTO
View on GitHub
Framework for Algorithmic Correctness Testing of Operators
☆16Mar 9, 2026Updated 4 months ago
Deep-Learning-Profiling-Tools / triton-samples
View on GitHub
☆14Mar 8, 2025Updated last year
Deep-Learning-Profiling-Tools / triton-viz
View on GitHub
☆351Jul 16, 2026Updated last week
proger / nanokitchen
View on GitHub
Parallel Associative Scan for Language Models
☆18Jan 8, 2024Updated 2 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
Jokeren / triton-samples
View on GitHub
☆29Jan 17, 2025Updated last year
haileyschoelkopf / triton-index
View on GitHub
See https://github.com/cuda-mode/triton-index/ instead!
☆11May 8, 2024Updated 2 years ago
AnswerDotAI / cold-compress
View on GitHub
Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of…
☆153Aug 9, 2024Updated last year
meta-pytorch / tritonbench
View on GitHub
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
☆363Updated this week
gpu-mode / triton-index
View on GitHub
Cataloging released Triton kernels.
☆310Sep 9, 2025Updated 10 months ago
Naman-ntc / FastCode
View on GitHub
Utilities for efficient fine-tuning, inference and evaluation of code generation models
☆21Oct 3, 2023Updated 2 years ago
Kernel-Machines / kermac
View on GitHub
Pytorch routines for (Ker)nel (Mac)hines
☆12Oct 10, 2025Updated 9 months ago
stas00 / python-tools
View on GitHub
Python tools
☆14Oct 22, 2023Updated 2 years ago
AllanYangZhou / midGPT
View on GitHub
Distributed pretraining of large language models (LLMs) on cloud TPU slices, with Jax and Equinox.
☆27Sep 29, 2024Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
cloneofsimo / min-fsdp
View on GitHub
☆93Jul 5, 2024Updated 2 years ago
BobMcDear / attorch
View on GitHub
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆605May 13, 2026Updated 2 months ago
evanatyourservice / llm-jax
View on GitHub
Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆19Jul 24, 2025Updated last year
honnibal / py-clearnlp-converter
View on GitHub
A simple Python wrapper for the ClearNLP constituents-to-dependencies converter
☆11Nov 2, 2015Updated 10 years ago
IntelLabs / EquiTriton
View on GitHub
EquiTriton is a project that seeks to implement high-performance kernels for commonly used building blocks in equivariant neural networks…
☆74May 25, 2026Updated 2 months ago
gpu-mode / triton-tutorials
View on GitHub
☆16May 14, 2025Updated last year
HazyResearch / lolcats
View on GitHub
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
☆260Jan 31, 2025Updated last year
srush / triton-autodiff
View on GitHub
Experiment of using Tangent to autodiff triton
☆81Jan 22, 2024Updated 2 years ago
meta-pytorch / spmd_types
View on GitHub
This module defines a type system for distributed training code, based off of JAX's sharding in types, but adapted for the PyTorch ecosys…
☆35Updated this week
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
danielvegamyhre / ml-perf-reading-group
View on GitHub
EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers)
☆36Mar 20, 2026Updated 4 months ago
zafstojano / policy-gradients
View on GitHub
A minimal hackable implementation of policy gradient methods (GRPO, PPO, REINFORCE)
☆16Feb 20, 2026Updated 5 months ago
flagos-ai / FlagAttention
View on GitHub
A collection of memory efficient attention operators implemented in the Triton language.
☆303Jul 22, 2026Updated last week
rwightman / genalog
View on GitHub
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te…
☆44May 23, 2026Updated 2 months ago
sustcsonglin / mamba-triton
View on GitHub
☆52Jan 28, 2024Updated 2 years ago
mikex86 / tritonc
View on GitHub
Standalone commandline CLI tool for compiling Triton kernels
☆20Sep 13, 2024Updated last year
srush / mamba-scans
View on GitHub
Blog post
☆17Feb 16, 2024Updated 2 years ago
fpgaminer / GPTQ-triton
View on GitHub
GPTQ inference Triton kernel
☆322May 18, 2023Updated 3 years ago
microsoft / TileFusion
View on GitHub
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
☆115Jun 28, 2025Updated last year
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
ivanleomk / modal-grpo
View on GitHub
☆19Mar 16, 2025Updated last year
shawntan / scattermoe
View on GitHub
Triton-based implementation of Sparse Mixture of Experts.
☆281Oct 3, 2025Updated 9 months ago
ekmett / hybrid-vectors
View on GitHub
Hybrid vectors e.g. mixed boxed/unboxed vectors that are suitable for use with vector-algorithms
☆14Aug 29, 2025Updated 10 months ago
dan-zheng / swift
View on GitHub
The Swift Programming Language
☆13Aug 4, 2021Updated 4 years ago
yihong-chen / ReFactorGNN
View on GitHub
Implementation for ReFactor GNNs
☆15Jun 10, 2025Updated last year
gevtushenko / llm.c
View on GitHub
LLM training in simple, raw C/CUDA
☆114May 1, 2024Updated 2 years ago
mstksg / backprop-learn
View on GitHub
Combinators and types for easily building trainable neural networks using the backprop library
☆34Feb 3, 2020Updated 6 years ago