Implementation of a Transformer, but completely in Triton
☆278 · Apr 5, 2022 · Updated 4 years ago
Alternatives and similar repositories for triton-transformer
Users that are interested in triton-transformer are comparing it to the libraries listed below.
- GPT, but made only out of MLPs ☆89 · May 25, 2021 · Updated 4 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆49 · Jan 27, 2022 · Updated 4 years ago
- Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper ☆81 · Oct 30, 2021 · Updated 4 years ago
- Implementation of Multistream Transformers in Pytorch ☆54 · Jul 31, 2021 · Updated 4 years ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆220 · Feb 13, 2023 · Updated 3 years ago
- Implementation of the Remixer Block from the Remixer paper, in Pytorch ☆36 · Sep 27, 2021 · Updated 4 years ago
- Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI ☆99 · Dec 31, 2021 · Updated 4 years ago
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ☆1,585 · Jan 28, 2026 · Updated 3 months ago
- Another attempt at a long-context / efficient transformer by me ☆38 · Apr 11, 2022 · Updated 4 years ago
- Contrastive Language-Image Pretraining ☆146 · Sep 6, 2022 · Updated 3 years ago
- GPTQ inference Triton kernel ☆322 · May 18, 2023 · Updated 3 years ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆600 · Aug 12, 2025 · Updated 9 months ago
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch ☆39 · Mar 29, 2022 · Updated 4 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆48 · Nov 30, 2021 · Updated 4 years ago
- Implementation of Fast Transformer in Pytorch ☆176 · Aug 26, 2021 · Updated 4 years ago
- An open source implementation of CLIP. ☆33 · Nov 7, 2022 · Updated 3 years ago
- Implementation of Flash Attention in Jax ☆228 · Mar 1, 2024 · Updated 2 years ago
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch ☆120 · Aug 4, 2021 · Updated 4 years ago
- Cataloging released Triton kernels. ☆302 · Sep 9, 2025 · Updated 8 months ago
- ☆30 · Oct 3, 2022 · Updated 3 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Apr 6, 2022 · Updated 4 years ago
- Transformers components but in Triton ☆34 · May 9, 2025 · Updated last year
- jax-triton contains integrations between JAX and OpenAI Triton ☆458 · Apr 23, 2026 · Updated 3 weeks ago
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 · Jan 23, 2024 · Updated 2 years ago
- ☆52 · Jan 28, 2024 · Updated 2 years ago
- Type annotations and dynamic checking for a tensor's shape, dtype, names, etc. ☆1,480 · May 2, 2025 · Updated last year
- Development repository for the Triton language and compiler ☆19,184 · Updated this week
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆190 · Jun 24, 2022 · Updated 3 years ago
- A concise but complete implementation of CLIP with various experimental improvements from recent papers ☆722 · Oct 16, 2023 · Updated 2 years ago
- A GPT, made only of MLPs, in Jax ☆59 · Jun 23, 2021 · Updated 4 years ago
- ☆19 · Dec 4, 2025 · Updated 5 months ago
- ☆113 · Mar 12, 2026 · Updated 2 months ago
- A collection of memory efficient attention operators implemented in the Triton language. ☆291 · Jun 5, 2024 · Updated last year
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆90 · Oct 11, 2024 · Updated last year
- Source code for "N-ary Constituent Tree Parsing with Recursive Semi-Markov Model" published at ACL 2021 ☆10 · May 27, 2021 · Updated 4 years ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆207 · Aug 26, 2023 · Updated 2 years ago
- ☆21 · Mar 15, 2023 · Updated 3 years ago
- Experiment of using Tangent to autodiff triton ☆82 · Jan 22, 2024 · Updated 2 years ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆68 · Apr 24, 2024 · Updated 2 years ago