yangzhch6/DARS

The official implementation of "Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration" (ICML 2026)

☆24

Alternatives and similar repositories for DARS

Users who are interested in DARS compare it to the repositories listed below.

LARK-AI-Lab / CodeScaler
The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"
☆35 · Mar 26, 2026 · Updated 4 months ago

AI4fun / DQ-LoRe
☆13 · Jun 26, 2024 · Updated 2 years ago

yangzhch6 / ReSocratic
OptiBench and ReSocratic Synthesis Method
☆35 · Oct 2, 2025 · Updated 9 months ago

MasterVito / SvS
Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training
☆54 · Dec 13, 2025 · Updated 7 months ago

rookie-joe / FormalAlign
☆17 · Jul 12, 2025 · Updated last year

Yingjia-Wan / FaStfact
Code repo for FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs.
☆33 · Nov 5, 2025 · Updated 8 months ago

LARK-AI-Lab / EnvFactory
The official paper for EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL.
☆85 · Jun 5, 2026 · Updated last month

MasterVito / SwS
Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning
☆42 · Nov 11, 2025 · Updated 8 months ago

zzli2022 / TLDR
Code for Research Project TLDR
☆26 · Jul 28, 2025 · Updated 11 months ago

Luo-Yihong / Reward-Instruct
[NeurIPS 2025] Reward-Instruct: A Reward-Centric Approach to Fast Photo-Realistic Image Generation
☆35 · Oct 24, 2025 · Updated 9 months ago

yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 · Oct 9, 2025 · Updated 9 months ago

RUCAIBox / Passk_Training
The official repository of paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models"
☆113 · Aug 15, 2025 · Updated 11 months ago

rookie-joe / PDA
☆36 · Jan 10, 2025 · Updated last year

sail-sg / feedback-conditional-policy
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆65 · Jan 5, 2026 · Updated 6 months ago

Luo-Yihong / TDM-R1
[ICML 2026][Ultra Powerful Few-Step Diffusion RL] TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
☆116 · May 25, 2026 · Updated 2 months ago

draw2think / harness-geometry
Implementation code for the paper "Draw2Think: Harnessing Geometry Reasoning through Constraint Engine Interaction"
☆17 · May 28, 2026 · Updated last month

kkk-an / UltraIF
Code of EMNLP 2025 paper 'UltraIF: Advancing Instruction Following from the Wild'.
☆21 · Apr 3, 2025 · Updated last year

EffiBench / EffiBench-X
[NeurIPS'25] EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code
☆15 · Oct 24, 2025 · Updated 9 months ago

multimodal-art-projection / TreePO
☆65 · Mar 30, 2026 · Updated 3 months ago

wangywUST / DeepEdit
Repository for our paper "DeepEdit: Knowledge Editing as Decoding with Constraints". https://arxiv.org/abs/2401.10471
☆21 · Jun 19, 2024 · Updated 2 years ago

chen-judge / UniGeo
[EMNLP 22] UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression
☆34 · Dec 7, 2022 · Updated 3 years ago

THUDM / TreeRL
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
☆97 · Jun 16, 2025 · Updated last year

zhaoyu-li / PyEuclid
[CAV 2025] PyEuclid: A Versatile Formal Plane Geometry System in Python
☆15 · Jun 27, 2025 · Updated last year

ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 10 months ago

AIoT-MLSys-Lab / MMDeepResearch-Bench
MMDeepResearch-Bench (MMDR)
☆31 · Apr 1, 2026 · Updated 3 months ago

Linear95 / APO
Code for ACL2024 paper - Adversarial Preference Optimization (APO).
☆54 · Jun 3, 2024 · Updated 2 years ago

BitSecret / HyperGNet
Geometric Problem Solving Integrating FormalGeo Symbolic System and Hypergraph Neural Network.
☆16 · Sep 23, 2025 · Updated 10 months ago

Luo-Yihong / TDM
[ICCV 2025][Few-Step Student Surpasses Teacher Diffusion] Learning Few-Step Diffusion Models by Trajectory Distribution Matching
☆99 · Mar 16, 2026 · Updated 4 months ago

sylvain-wei / TIME
[NeurIPS 2025 D&B (Spotlight🌟)] TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario
☆32 · Oct 5, 2025 · Updated 9 months ago

Noahs-ARK / PaLM
PyTorch implementation for PaLM: A Hybrid Parser and Language Model.
☆10 · Jan 7, 2020 · Updated 6 years ago

Jiahao004 / DeepTheorem
☆27 · Jun 10, 2025 · Updated last year

rdi-berkeley / awesome-RLVR-boundary
A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu…
☆89 · Dec 12, 2025 · Updated 7 months ago

seamoke / DPH-RL
This is the official implementation of paper "The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement…
☆20 · Feb 10, 2026 · Updated 5 months ago

xinhjBrant / APE-Bench_I
☆25 · Feb 3, 2026 · Updated 5 months ago

Huawei-AI4Math / ProofFlow
☆23 · Jun 28, 2026 · Updated 3 weeks ago

alchemistyzz / PeRL
[NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning"
☆30 · Mar 30, 2026 · Updated 3 months ago

He-Ren / OJBench
☆32 · Feb 28, 2026 · Updated 4 months ago

EvanZhuang / mixinputs
Official implementation for Text Generation Beyond Discrete Token Sampling
☆26 · Aug 11, 2025 · Updated 11 months ago

laiguokun / DSGC
☆10 · Dec 21, 2019 · Updated 6 years ago