igul222 / plaid
☆78 · Updated last year
Alternatives and similar repositories for plaid:
Users who are interested in plaid compare it to the libraries listed below:
- Reparameterized Discrete Diffusion Models for Text Generation ☆92 · Updated last year
- DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models ☆71 · Updated last month
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning" ☆65 · Updated 11 months ago
- DiffusER: Discrete Diffusion via Edit-based Reconstruction (Reid, Hellendoorn & Neubig, 2022) ☆54 · Updated last year
- ☆116 · Updated 10 months ago
- Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control ☆66 · Updated 2 years ago
- ☆60 · Updated 3 weeks ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆97 · Updated last year
- ☆78 · Updated 10 months ago
- Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning" ☆28 · Updated 2 months ago
- [ICML 2022] Latent Diffusion Energy-Based Model for Interpretable Text Modeling ☆64 · Updated 2 years ago
- Implementation of Self-conditioned Embedding Diffusion for Text Generation ☆37 · Updated 2 years ago
- The official codebase for "Empowering Diffusion Models on the Embedding Space for Text Generation" (NAACL 2024) ☆52 · Updated 8 months ago
- Stick-breaking attention ☆41 · Updated this week
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆62 · Updated 8 months ago
- Language Quantized AutoEncoders ☆95 · Updated last year
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆34 · Updated last year
- Beyond Straight-Through ☆93 · Updated last year
- Code for the paper https://arxiv.org/abs/2205.14987v2 ☆46 · Updated 8 months ago
- Simplified Masked Diffusion Language Model ☆251 · Updated last month
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆51 · Updated last year
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆94 · Updated 10 months ago
- Directional Preference Alignment ☆54 · Updated 3 months ago
- [ICML 2023] Reflected Diffusion Models (https://arxiv.org/abs/2304.04740) ☆157 · Updated last year
- ☆32 · Updated last year
- Online Adaptation of Language Models with a Memory of Amortized Contexts (NeurIPS 2024) ☆59 · Updated 5 months ago
- ☆33 · Updated last year
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior. ☆41 · Updated 11 months ago
- [ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper - Toeplitz Neural Network for Sequence Modeling ☆76 · Updated 8 months ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆58 · Updated 3 months ago