BobMcDear / attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆596 · Updated Aug 12, 2025
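As a rough illustration of what "PyTorch modules written in Triton" means in practice, below is a minimal, generic sketch of a Triton kernel with a thin PyTorch-facing wrapper. It is not taken from attorch itself; the names `relu_kernel` and `triton_relu` are illustrative only.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def relu_kernel(x_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, tl.maximum(x, 0.0), mask=mask)


def triton_relu(x: torch.Tensor) -> torch.Tensor:
    # Thin wrapper: allocate the output and launch the kernel on a 1D grid.
    out = torch.empty_like(x)
    n_elements = x.numel()
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    relu_kernel[grid](x, out, n_elements, BLOCK_SIZE=1024)
    return out


if __name__ == "__main__":
    x = torch.randn(4096, device="cuda")
    torch.testing.assert_close(triton_relu(x), torch.relu(x))
```

Libraries in this space typically go further, wrapping such kernels in `torch.autograd.Function` subclasses and `nn.Module` interfaces and adding autotuning; the sketch above omits those layers.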
Alternatives and similar repositories for attorch
Users interested in attorch are comparing it to the libraries listed below.
- Cataloging released Triton kernels. ☆294 · Updated Sep 9, 2025
- Puzzles for learning Triton ☆2,296 · Updated Nov 18, 2024
- Tile primitives for speedy kernels ☆3,139 · Updated this week
- Experiment of using Tangent to autodiff triton ☆82 · Updated Jan 22, 2024
- Applied AI experiments and examples for PyTorch ☆315 · Updated Aug 22, 2025
- Triton-based implementation of Sparse Mixture of Experts. ☆265 · Updated Oct 3, 2025
- Collection of kernels written in Triton language ☆178 · Updated Jan 27, 2026
- ☆288 · Updated this week
- Efficient Triton Kernels for LLM Training ☆6,141 · Updated this week
- Accelerated First Order Parallel Associative Scan ☆196 · Updated Jan 7, 2026
- Fast low-bit matmul kernels in Triton ☆429 · Updated Feb 1, 2026
- Ring attention implementation with flash attention ☆980 · Updated Sep 10, 2025
- 🚀 Efficient implementations of state-of-the-art linear attention models ☆4,379 · Updated this week
- Helpful tools and examples for working with flex-attention ☆1,127 · Updated this week
- Mirage Persistent Kernel: Compiling LLMs into a MegaKernel ☆2,120 · Updated Jan 29, 2026
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆326 · Updated this week
- A collection of memory efficient attention operators implemented in the Triton language. ☆287 · Updated Jun 5, 2024
- extensible collectives library in triton ☆95 · Updated Mar 31, 2025
- A PyTorch native platform for training generative AI models ☆5,045 · Updated this week
- ☆104 · Updated Nov 7, 2024
- PyTorch native quantization and sparsity for training and inference ☆2,668 · Updated this week
- FlexAttention w/ FlashAttention3 Support ☆27 · Updated Oct 5, 2024
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable. ☆1,586 · Updated Jan 28, 2026
- train with kittens! ☆63 · Updated Oct 25, 2024
- ☆115 · Updated Aug 26, 2024
- Annotated version of the Mamba paper ☆496 · Updated Feb 27, 2024
- Minimalistic large language model 3D-parallelism training ☆2,544 · Updated Dec 11, 2025
- ☆124 · Updated May 28, 2024
- Transformers components but in Triton ☆34 · Updated May 9, 2025
- FlashInfer: Kernel Library for LLM Serving ☆4,935 · Updated this week
- Framework to reduce autotune overhead to zero for well known deployments. ☆96 · Updated Sep 19, 2025
- TensorDict is a pytorch dedicated tensor container. ☆1,006 · Updated this week
- Shared Middle-Layer for Triton Compilation ☆326 · Updated Dec 5, 2025
- GPU programming related news and material links ☆1,967 · Updated Sep 17, 2025
- UNet diffusion model in pure CUDA ☆661 · Updated Jun 28, 2024
- FlagGems is an operator library for large language models implemented in the Triton Language. ☆898 · Updated this week
- GPTQ inference Triton kernel ☆321 · Updated May 18, 2023
- flash attention tutorial written in python, triton, cuda, cutlass ☆486 · Updated Jan 20, 2026
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆2,076 · Updated Aug 26, 2025