toyaix / TritonLLM
LLM Inference via Triton (Flexible & Modular): Focused on Kernel Optimization using CUBIN binaries, Starting from gpt-oss Model
☆64Oct 18, 2025Updated 3 months ago
Alternatives and similar repositories for TritonLLM
Users who are interested in TritonLLM are comparing it to the libraries listed below.
- ☆117Jan 10, 2026Updated last month
- triton for dsa☆57Jan 30, 2026Updated 2 weeks ago
- JAX bindings for the flash-attention3 kernels☆20Jan 2, 2026Updated last month
- Take your first step in writing a compiler. Implemented in Rust.☆16Apr 17, 2023Updated 2 years ago
- FlagTree is a unified compiler supporting multiple AI chip backends for custom Deep Learning operations, which is forked from triton-lang…☆211Updated this week
- ☆27Jan 7, 2025Updated last year
- Getting Started with Triton: A Tutorial for Python Beginners☆35Oct 21, 2025Updated 3 months ago
- Manages vllm-nccl dependency☆17Jun 3, 2024Updated last year
- ☆80Jan 22, 2026Updated 3 weeks ago
- ☆31Jan 28, 2026Updated 2 weeks ago
- PyTorch distributed training acceleration framework☆55Aug 13, 2025Updated 6 months ago
- A minimal implementation of vllm.☆67Jul 27, 2024Updated last year
- DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit☆92Jan 26, 2026Updated 2 weeks ago
- ☆26Aug 19, 2022Updated 3 years ago
- Triton based sparse quantization attention kernel collection☆38Aug 29, 2025Updated 5 months ago
- Tile-based language built for AI computation across all scales☆120Updated this week
- High performance inference engine for diffusion models☆103Sep 5, 2025Updated 5 months ago
- Hands-On Practical MLIR Tutorial☆719Oct 20, 2023Updated 2 years ago
- ☆34Feb 3, 2025Updated last year
- USTC Computational Physics A☆10Aug 16, 2021Updated 4 years ago
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…☆13Jan 16, 2026Updated 3 weeks ago
- ☆84Feb 6, 2026Updated last week
- Collection of kernels written in Triton language☆178Jan 27, 2026Updated 2 weeks ago
- Personal homepage of the Advanced Compiler Lab☆197Oct 15, 2025Updated 3 months ago
- Docker cluster configuration for Hadoop☆11Jun 8, 2024Updated last year
- This project is based on the [LTX-Video](https://github.com/Lightricks/LTX-Video) algorithm of the diffusers and optimized and accelerate…☆11Dec 31, 2024Updated last year
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- Protocol buffers and other common resources.☆13Jan 20, 2026Updated 3 weeks ago
- ☆10Jun 28, 2025Updated 7 months ago
- python port of arc90's readability bookmarklet, updated to match latest readability.js!☆19Sep 13, 2011Updated 14 years ago
- netbeacon - monitoring your network capture, NIDS or network analysis process☆19Oct 26, 2013Updated 12 years ago
- Stateful LLM Serving☆95Mar 11, 2025Updated 11 months ago
- Distributed Compiler based on Triton for Parallel Systems☆1,350Updated this week
- Machine Learning Compilation, Tianqi Chen☆53Jan 1, 2023Updated 3 years ago
- Efficient Long-context Language Model Training by Core Attention Disaggregation☆87Jan 29, 2026Updated 2 weeks ago
- ☆38Jun 27, 2025Updated 7 months ago
- AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)☆93Jul 14, 2023Updated 2 years ago
- ☆15Jul 18, 2023Updated 2 years ago
- PySOM - The Simple Object Machine Smalltalk implemented in Python☆18Aug 19, 2025Updated 5 months ago