ai-compiler-study / triton-kernels
Triton kernels for Flux
☆17 · Updated this week
Related projects
Alternatives and complementary repositories for triton-kernels
- Writing FLUX in Triton ☆30 · Updated last month
- FlexAttention w/ FlashAttention3 Support ☆27 · Updated last month
- PyTorch half-precision GEMM library with fused optional bias + optional ReLU/GELU (a minimal sketch of this kind of fused kernel appears after this list) ☆38 · Updated 2 months ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆43 · Updated last year
- Mixture of A Million Experts ☆31 · Updated 3 months ago
- Make Triton easier ☆41 · Updated 5 months ago
- Hacks for PyTorch ☆17 · Updated last year
- Faster PyTorch bitsandbytes 4-bit FP4 nn.Linear ops ☆23 · Updated 7 months ago
- DPO, but faster 🚀 ☆21 · Updated 2 weeks ago
- Implementation of Diffusion Transformers and Rectified Flow in JAX ☆20 · Updated 4 months ago
- Experiment of using Tangent to autodiff Triton ☆72 · Updated 9 months ago
- Official codebase for Margin-aware Preference Optimization for Aligning Diffusion Models without Reference (MaPO) ☆60 · Updated 5 months ago
- Here we will test various linear attention designs ☆56 · Updated 6 months ago
- RS-IMLE
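
Several of the entries above provide fused elementwise Triton kernels (for example the GEMM library with fused optional bias + ReLU/GELU). As a rough illustration of what such a kernel looks like, here is a minimal, self-contained sketch of a fused bias-add + GELU kernel. It is not taken from any of the listed repositories; the function names, the block size, and the sigmoid-based GELU approximation are all illustrative assumptions.

```python
# Illustrative sketch only -- not code from any repository listed above.
# A fused bias-add + GELU elementwise kernel, the kind of fusion several of
# the listed Triton projects provide.
import torch
import triton
import triton.language as tl


@triton.jit
def bias_gelu_kernel(x_ptr, bias_ptr, out_ptr, n_elements, n_cols, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask, other=0.0)
    # The bias has one value per column; broadcast it across rows.
    b = tl.load(bias_ptr + (offsets % n_cols), mask=mask, other=0.0)
    y = x + b
    # Sigmoid approximation of GELU: gelu(y) ~= y * sigmoid(1.702 * y).
    y = y * tl.sigmoid(1.702 * y)
    tl.store(out_ptr + offsets, y, mask=mask)


def bias_gelu(x: torch.Tensor, bias: torch.Tensor, block: int = 1024) -> torch.Tensor:
    """Apply bias + GELU to a 2D CUDA tensor using the kernel above."""
    assert x.is_cuda and x.ndim == 2 and bias.shape == (x.shape[-1],)
    x = x.contiguous()  # flat indexing below assumes row-major contiguity
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, block),)
    bias_gelu_kernel[grid](x, bias, out, n, x.shape[-1], BLOCK=block)
    return out
```

In practice, projects like those listed above typically fuse the epilogue (bias, ReLU/GELU) directly into a GEMM kernel rather than running a separate elementwise pass; the sketch only shows the epilogue part in isolation.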