pengzhangzhi / Flash-Attention-with-Bias-Triton
Triton Implementation of Flash Attention with Bias.
☆14Updated 6 months ago
Alternatives and similar repositories for Flash-Attention-with-Bias-Triton
Users who are interested in Flash-Attention-with-Bias-Triton are comparing it to the libraries listed below.
- Official Code for Paper "Think While You Generate: Discrete Diffusion with Planned Denoising" [ICLR 2025]☆81Updated 6 months ago
- Reproduce ICLR2025 Energy-Based Diffusion Language Models for Text Generation☆38Updated 3 months ago
- Official Jax Implementation of MD4 Masked Diffusion Models☆138Updated 8 months ago
- Reparameterized Discrete Diffusion Models for Text Generation☆101Updated 2 years ago
- Code for the paper https://arxiv.org/abs/2402.04997☆97Updated last year
- Inference-Time Alignment in Protein Diffusion Models☆44Updated 9 months ago
- ☆17Updated last year
- ☆108Updated 2 years ago
- Simple Guidance Mechanisms for Discrete Diffusion Models☆57Updated 10 months ago
- Python package for P2 (Path Planning), a masked diffusion model sampling method for sequence generation (protein, text, etc.).☆19Updated 2 months ago
- [ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning"☆78Updated 8 months ago
- Retrieved Sequence Augmentation for Protein Representation Learning☆53Updated 2 years ago
- Code for paper: "Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design"☆63Updated 5 months ago
- PyTorch version of Continuous Language Generative Flow (ACL 2021)☆11Updated 4 years ago
- Stick-breaking attention☆61Updated 4 months ago
- ☆39Updated 3 years ago
- Code accompanying the paper "Generalized Interpolating Discrete Diffusion"☆106Updated 4 months ago
- Automatically collects diffusion NLP papers from arXiv. More information on the papers can be found in another repository, "Diffusion-LM-Papers".☆210Updated this week
- ☆125Updated last year
- [ICML2025] The official implementation of "WGFormer: An SE(3)-Transformer Driven by Wasserstein Gradient Flows for Molecular Ground-State…☆30Updated 4 months ago
- Derivative-Free Guidance in Diffusion Models with Soft Value-Based Decoding. For controlled generation in DNA, RNA, proteins, molecules (…☆34Updated last year
- [NeurIPS 2024] Simple and Effective Masked Diffusion Language Model☆543Updated last month
- Code for the paper: "Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods"☆27Updated 5 months ago
- Simple Scalable Discrete Diffusion for text in PyTorch☆37Updated last year
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter".☆12Updated last year
- Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"☆329Updated 10 months ago
- Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control☆75Updated 2 years ago
- The Family of Diffusion Protein Language Models (DPLM)☆273Updated 3 months ago
- Code for the paper https://arxiv.org/abs/2205.14987v2☆56Updated last year
- Any-Order GPT as Masked Diffusion Model: Decoupling Formulation and Architecture. Training an MDM using GPT with this repo!☆27Updated 4 months ago