pengzhangzhi / Flash-Attention-with-Bias-Triton
Triton Implementation of Flash Attention with Bias.
☆19Updated 8 months ago
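For context on what this repository implements: attention with an additive bias computes softmax(QKᵀ/√d + B)V, where B is a per-(query, key) bias matrix (used e.g. for relative position or pairwise biases). Below is a minimal, naive pure-Python reference of that formula, assuming row-major lists of lists; it is a readability sketch, not the repository's fused Triton kernel, which computes the same result without materializing the full score matrix.

```python
import math

def attention_with_bias(Q, K, V, B):
    """Naive softmax attention with additive bias:
    out = softmax(Q K^T / sqrt(d) + B) V.
    Q: [n, d], K: [m, d], V: [m, dv], B: [n, m] (lists of lists).
    Illustrative reference only; a Flash-Attention kernel tiles this
    computation and never stores the full n x m score matrix.
    """
    n, d = len(Q), len(Q[0])
    m, dv = len(K), len(V[0])
    out = []
    for i in range(n):
        # Scaled dot-product scores plus the bias term B[i][j].
        scores = [
            sum(Q[i][k] * K[j][k] for k in range(d)) / math.sqrt(d) + B[i][j]
            for j in range(m)
        ]
        # Numerically stable softmax over the key axis.
        mx = max(scores)
        exps = [math.exp(s - mx) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]
        # Weighted sum of values.
        out.append([sum(w[j] * V[j][c] for j in range(m)) for c in range(dv)])
    return out
```

Setting a bias entry to a large negative value masks that key out entirely, which is how additive bias subsumes attention masking.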
Alternatives and similar repositories for Flash-Attention-with-Bias-Triton
Users interested in Flash-Attention-with-Bias-Triton are comparing it to the libraries listed below.
- Stick-breaking attention☆62Updated 6 months ago
- Official Code for Paper "Think While You Generate: Discrete Diffusion with Planned Denoising" [ICLR 2025]☆84Updated 8 months ago
- Reparameterized Discrete Diffusion Models for Text Generation☆104Updated 2 years ago
- Official Jax Implementation of MD4 Masked Diffusion Models☆151Updated 10 months ago
- ☆111Updated 2 years ago
- [ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning"☆87Updated 10 months ago
- ☆83Updated 2 years ago
- Code accompanying the paper "Generalized Interpolating Discrete Diffusion"☆112Updated 7 months ago
- Awesome Triton Resources☆39Updated 8 months ago
- Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control☆76Updated 3 years ago
- Reproduction of the ICLR 2025 paper "Energy-Based Diffusion Language Models for Text Generation"☆52Updated 5 months ago
- ☆22Updated 2 years ago
- ☆42Updated 3 years ago
- Triton implementation of FlashAttention2 with support for custom masks☆159Updated last year
- ☆107Updated last year
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se …☆66Updated last year
- Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024)☆58Updated last year
- ☆57Updated last year
- Code for the paper: "Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods"☆31Updated 7 months ago
- Flash-Linear-Attention models beyond language☆20Updated 4 months ago
- Code for the paper https://arxiv.org/abs/2402.04997☆102Updated last year
- ☆102Updated 10 months ago
- ☆35Updated last year
- ☆18Updated last year
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning☆134Updated 3 weeks ago
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)☆24Updated last year
- Automatically collects diffusion NLP papers from arXiv. More paper information can be found in the companion repository "Diffusion-LM-Papers".☆248Updated this week
- Educational implementation of the Discrete Flow Matching paper☆127Updated last year
- Continuous batching and parallel acceleration for RWKV6☆22Updated last year
- Fast and memory-efficient exact attention☆75Updated 10 months ago