bansky-cl / Diffusion_NLP_Papers

A list of diffusion papers in the NLP domain that I have read, focused mainly on text generation. The table will continue to be updated.

☆36

Alternatives and similar repositories for Diffusion_NLP_Papers:

Users who are interested in Diffusion_NLP_Papers are comparing it to the repositories listed below.

bansky-cl / diffusion-nlp-paper-arxiv
Automatically fetches diffusion NLP papers from arXiv. More paper information can be found in the companion repository "Diffusion_NLP_Papers".
☆79 Updated this week
HKUNLP / diffusion-of-thoughts
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
☆109 Updated 11 months ago
ML-GSAI / SMDM
Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"
☆66 Updated last month
yegcjs / DiffusionLLM
Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"
☆65 Updated last year
zhjgao / difformer
The official codebase for "Empowering Diffusion Models on the Embedding Space for Text Generation" (NAACL 2024)
☆54 Updated 9 months ago
AoiDragon / Awesome-Text-Diffusion-Models
[IJCAI'23] The official GitHub page of the paper "Diffusion Models for Non-autoregressive Text Generation: A Survey".
☆26 Updated last year
HKUNLP / reparam-discrete-diffusion
Reparameterized Discrete Diffusion Models for Text Generation
☆94 Updated 2 years ago
HKUNLP / DiffuLLaMA
[ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models
☆89 Updated 2 months ago
ML-GSAI / RADD
Official PyTorch implementation for "Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data"
☆25 Updated 5 months ago
ZetangForward / Bridge_Gap_Diffusion
☆33 Updated last year
yegcjs / DINOISER
☆24 Updated last year
thu-ml / Noise-Contrastive-Alignment
Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024)
☆45 Updated 3 months ago
Yuanhy1997 / SeqDiffuSeq
Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation [NAACL 2024]
☆94 Updated last year
zhenyuhe00 / BiPE
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024
☆21 Updated 7 months ago
igul222 / plaid
☆86 Updated last year
justinlovelace / latent-diffusion-for-language
☆122 Updated 11 months ago
RUCAIBox / Awesome-Text-Diffusion-Models
[IJCAI'23] The official GitHub page of the paper "Diffusion Models for Non-autoregressive Text Generation: A Survey".
☆48 Updated 8 months ago
RZFan525 / Awesome-ScalingLaws
A curated list of awesome resources dedicated to Scaling Laws for LLMs
☆69 Updated last year
VITA-Group / Ms-PoE
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiw…
☆25 Updated 9 months ago
gregorbachmann / Next-Token-Failures
☆80 Updated 11 months ago
xhan77 / ssd-lm
Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
☆66 Updated 2 years ago
GATECH-EIC / ACT
[ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…
☆31 Updated 7 months ago
dongxiangjue / Awesome-LLM-Self-Improvement
A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …
☆67 Updated last month
hanxuhu / SeqIns
The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA…
☆29 Updated 2 months ago
r-three / smear
☆27 Updated last year
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆49 Updated 4 months ago
sail-sg / scaling-with-vocab
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
☆77 Updated 4 months ago
PKU-ML / LongPPL
☆27 Updated 3 months ago
chuanyang-Zheng / DAPE
This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"
☆35 Updated 4 months ago
Shwai-He / MEO
The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" (EMNLP 2023)
☆35 Updated 10 months ago