aakaran/reasoning-with-sampling

☆438

Alternatives and similar repositories for reasoning-with-sampling

Users who are interested in reasoning-with-sampling are comparing it to the repositories listed below.

maxzuo / mh-llm
Fast Metropolis-Hastings sampler for LLMs.
☆26 · Jan 2, 2026 · Updated 6 months ago
cvenhoff / thinking-llms-interp
☆25 · Jul 8, 2026 · Updated 2 weeks ago
LeapLabTHU / limit-of-RLVR
Repo for paper https://arxiv.org/abs/2504.13837
☆346 · Dec 17, 2025 · Updated 7 months ago
PRIME-RL / TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
☆1,103 · Apr 15, 2026 · Updated 3 months ago
test-time-training / discover
☆611 · May 24, 2026 · Updated 2 months ago
ruixin31 / Spurious_Rewards
☆361 · Jul 29, 2025 · Updated 11 months ago
dllm-reasoning / d1
Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"
☆453 · Jan 26, 2026 · Updated 5 months ago
Xuekai-Zhu / FlowRL
☆180 · Nov 24, 2025 · Updated 8 months ago
thunlp / OPD
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
☆843 · Jun 29, 2026 · Updated 3 weeks ago
lasgroup / SDPO
Reinforcement Learning via Self-Distillation (SDPO)
☆1,021 · Jul 1, 2026 · Updated 3 weeks ago
alexOarga / compositional_reasoning
[NeurIPS'25] Generalizable Reasoning through Compositional Energy Minimization
☆28 · Oct 28, 2025 · Updated 8 months ago
yongliang-wu / DFT
[ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.
☆588 · Jan 4, 2026 · Updated 6 months ago
princeton-pli / MeCo
Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"
☆51 · Jun 30, 2025 · Updated last year
microsoft / rStar
☆1,422 · Sep 12, 2025 · Updated 10 months ago
Gen-Verse / dLLM-RL
[ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series.
☆511 · Jan 28, 2026 · Updated 5 months ago
verl-project / verl
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,649 · Updated this week
ML-GSAI / LLaDA
Official PyTorch implementation for "Large Language Diffusion Models"
☆3,908 · Jul 15, 2026 · Updated last week
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 · Mar 11, 2026 · Updated 4 months ago
Silent-Zebra / twisted-smc-lm
☆35 · Mar 27, 2025 · Updated last year
sunblaze-ucb / Intuitor
[ICLR 2026] Learning to Reason without External Rewards
☆418 · Jan 26, 2026 · Updated 5 months ago
ZJU-REAL / InftyThink-Plus
[ICML 2026] InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
☆34 · May 25, 2026 · Updated last month
sunblaze-ucb / rl-grok-recipe
Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs"
☆35 · Oct 12, 2025 · Updated 9 months ago
XiangchengZhang / Diffusion-inference-scaling
Official Implementation for Inference-time Scaling of Diffusion Models through Classical Search
☆33 · Oct 8, 2025 · Updated 9 months ago
david3684 / flm
Official codebase for the paper "One-step Language Modeling via Continuous Denoising"
☆160 · Jul 7, 2026 · Updated 2 weeks ago
yilundu / ired_code_release
☆94 · Jun 14, 2024 · Updated 2 years ago
CharlesQ9 / Self-Evolving-Agents
☆1,259 · Oct 15, 2025 · Updated 9 months ago
facebookresearch / PhysicsLM4
Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality
☆356 · May 20, 2026 · Updated 2 months ago
ZHZisZZ / dllm
dLLM: Simple Diffusion Language Modeling
☆2,651 · Jul 17, 2026 · Updated last week
michaelbzhu / lora-without-regret
☆47 · Oct 23, 2025 · Updated 9 months ago
idanshen / Self-Distillation
☆662 · Apr 7, 2026 · Updated 3 months ago
thunlp / JustRL
[ICLR 2026 Blogpost Track Poster] JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
☆292 · Jun 29, 2026 · Updated 3 weeks ago
sparkle-reasoning / sparkle
[NeurIPS'25] Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
☆16 · Dec 12, 2025 · Updated 7 months ago
thu-nics / TaH
[ICML'26] Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models"
☆75 · Jul 17, 2026 · Updated last week
NVlabs / Fast-dLLM
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
☆1,063 · May 30, 2026 · Updated last month
aHapBean / NITP
[ICML 2026] NITP: Next Implicit Token Prediction for LLM Pre-training
☆34 · May 26, 2026 · Updated last month
Cominclip / OmniVerifier
[ICLR 2026 Oral & ICML 2026] Generative Universal Verifier as Multimodal Meta-Reasoner
☆64 · May 29, 2026 · Updated last month
ChengpengLi1003 / CoRT
☆72 · Oct 23, 2025 · Updated 9 months ago
McGill-NLP / the-markovian-thinker
Code for paper "The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning"
☆350 · Mar 16, 2026 · Updated 4 months ago
JinjieNi / dlms-are-super-data-learners
The official GitHub repo for "Diffusion Language Models are Super Data Learners".
☆227 · Nov 6, 2025 · Updated 8 months ago