hanyang1999 / discrete-diffusion-papers

A collection of papers on discrete diffusion models

☆153

Alternatives and similar repositories for discrete-diffusion-papers

Users who are interested in discrete-diffusion-papers are comparing it to the repositories listed below.

ThreeSR / Awesome-Inference-Time-Scaling
Paper List of Inference/Test Time Scaling/Computing
☆286 Updated last month
yczhou001 / Awesome-Diffusion-LLM
paper list, tutorial, and nano code snippet for Diffusion Large Language Models.
☆92 Updated last month
Joshua-Ren / Learning_dynamics_LLM
☆155 Updated 2 months ago
HKUNLP / diffusion-of-thoughts
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
☆172 Updated 5 months ago
ML-GSAI / Diffusion-LLM-Papers
A Collection of Papers on Diffusion Language Models
☆97 Updated last month
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆103 Updated 3 weeks ago
ruixin31 / Spurious_Rewards
☆322 Updated last week
haonan3 / AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
☆212 Updated 6 months ago
LINs-lab / DynMoE
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
☆121 Updated 3 weeks ago
ML-GSAI / SMDM
Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"
☆267 Updated 7 months ago
dllm-reasoning / d1
Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"
☆255 Updated last month
HKUNLP / DiffuLLaMA
[ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models
☆253 Updated 2 months ago
fscdc / Awesome-Efficient-Reasoning-Models
[arXiv 2025] Efficient Reasoning Models: A Survey
☆244 Updated 2 weeks ago
NVlabs / Fast-dLLM
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
☆320 Updated this week
horseee / dKV-Cache
☆88 Updated 2 months ago
OpenSparseLLMs / MoM
☆95 Updated 3 months ago
maomaocun / dLLM-cache
Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache…
☆132 Updated this week
RUCBM / DeepCritic
Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"
☆32 Updated last month
zitian-gao / one-shot-em
One-shot Entropy Minimization
☆172 Updated last month
horseee / CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
☆81 Updated 5 months ago
Dereck0602 / Awesome_Test_Time_LLMs
☆117 Updated 4 months ago
ML-GSAI / LLaDA-V
☆188 Updated this week
bansky-cl / Diffusion-LM-Papers
Listing some diffusion papers in NLP domain I have read, text generation is main, table will continue to be updated.
☆59 Updated 4 months ago
MingyuJ666 / Rope_with_LLM
[ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen…
☆75 Updated last month
sustcsonglin / linear-attention-and-beyond-slides
☆79 Updated 5 months ago
LeapLabTHU / limit-of-RLVR
repo for paper https://arxiv.org/abs/2504.13837
☆180 Updated last month
Chongjie-Si / Subspace-Tuning
A generalized framework for subspace tuning methods in parameter efficient fine-tuning.
☆153 Updated last month
yihedeng9 / rlhf-summary-notes
A brief and partial summary of RLHF algorithms.
☆131 Updated 5 months ago
Clin0212 / HydraLoRA
[NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning
☆220 Updated 8 months ago
GATECH-EIC / ACT
[ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…
☆40 Updated last year