dllm-reasoning / d1Links

Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"

☆255

Alternatives and similar repositories for d1

Users that are interested in d1 are comparing it to the libraries listed below

Sorting:

HKUNLP / DiffuLLaMA
[ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models
☆253Updated 2 months ago
ML-GSAI / SMDM
Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"
☆267Updated 7 months ago
HKUNLP / diffusion-of-thoughts
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
☆173Updated 5 months ago
ruixin31 / Spurious_Rewards
☆323Updated last week
eric-ai-lab / Soft-Thinking
Official implementation of the paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space"
☆204Updated last week
multimodal-art-projection / LatentCoT-Horizon
📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.
☆171Updated last week
LeapLabTHU / limit-of-RLVR
repo for paper https://arxiv.org/abs/2504.13837
☆180Updated last month
maple-research-lab / LLaDOU
Large Language Diffusion with Ordered Unmasking
☆44Updated last week
ypwang61 / One-Shot-RLVR
official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example”
☆337Updated last week
hanyang1999 / discrete-diffusion-papers
A collection of papers on discrete diffusion models
☆156Updated last month
yczhou001 / Awesome-Diffusion-LLM
paper list, tutorial, and nano code snippet for Diffusion Large Language Models.
☆96Updated last month
LeslieTrue / SFTvsRL
Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
☆289Updated 3 months ago
kuleshov-group / bd3lms
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
☆749Updated 3 weeks ago
sunblaze-ucb / Intuitor
Code for the paper: "Learning to Reason without External Rewards"
☆344Updated 3 weeks ago
haonan3 / AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
☆212Updated 6 months ago
NVlabs / Fast-dLLM
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
☆320Updated this week
lucidrains / coconut-pytorch
Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch
☆178Updated last month
kuleshov-group / mdlm
[NeurIPS 2024] Simple and Effective Masked Diffusion Language Model
☆466Updated 2 months ago
ThreeSR / Awesome-Inference-Time-Scaling
Paper List of Inference/Test Time Scaling/Computing
☆286Updated last month
DreamLM / Dream
Dream 7B, a large diffusion language model
☆873Updated last month
ML-GSAI / LLaDA-V
☆188Updated this week
PRIME-RL / ImplicitPRM
Repo of paper "Free Process Rewards without Process Labels"
☆160Updated 4 months ago
LLM360 / Reasoning360
A repo for open research on building large reasoning models
☆84Updated last week
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆234Updated 2 months ago
PRIME-RL / Entropy-Mechanism-of-RL
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
☆275Updated 3 weeks ago
jxiw / MambaInLlama
[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models
☆225Updated 3 months ago
bansky-cl / diffusion-nlp-paper-arxiv
Auto get diffusion nlp papers in Axriv. More papers Information can be found in another repository "Diffusion-LM-Papers".
☆160Updated this week
facebookresearch / PhysicsLM4
Physics of Language Models, Part 4
☆204Updated last week
DreamLM / Dream-Coder
☆30Updated 3 weeks ago
CMU-AIRe / MRT
Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".
☆100Updated 3 weeks ago