StigLidu/DualDistill

[EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"

☆104

Alternatives and similar repositories for DualDistill

Users who are interested in DualDistill are comparing it to the libraries listed below.

StigLidu / AdaExplore
The official implementation for paper "AdaExplore: Failure-Driven Adaptation and Diversity-Preserving Search for Efficient Kernel Generat…
☆22 · Jul 12, 2026 · Updated last week
StigLidu / TURN
[ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"
☆23 · Feb 16, 2025 · Updated last year
StigLidu / CodeGym
[ICLR2026] The official repository for the CodeGym project: "Generalizable End-to-End Tool-Use RL with Synthetic CodeGym"
☆35 · Oct 14, 2025 · Updated 9 months ago
ChengpengLi1003 / CoRT
☆72 · Oct 23, 2025 · Updated 9 months ago
TEAM-ARM / arm
[NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model
☆68 · Apr 6, 2026 · Updated 3 months ago
prometheus-eval / scaling-evaluation-compute
Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"
☆12 · Mar 25, 2025 · Updated last year
MasterVito / SwS
Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning
☆42 · Nov 11, 2025 · Updated 8 months ago
matt-seb-ho / arc_memo
☆45 · Dec 15, 2025 · Updated 7 months ago
jiwonsong-dev / ReasoningPathCompression
[NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning"
☆32 · Oct 20, 2025 · Updated 9 months ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 · Jul 23, 2025 · Updated last year
RefineBench / refinebench-eval
Official code and dataset for our paper: RefineBench: Evaluating Refinement Capability of Language Models via Checklists
☆17 · Dec 1, 2025 · Updated 7 months ago
liushulinle / UloRL
An Ultra-Long Output Reinforcement Learning Approach
☆23 · Jul 31, 2025 · Updated 11 months ago
facebookresearch / sweet_rl
Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks
☆271 · May 5, 2025 · Updated last year
ChengpengLi1003 / Awesome-Long-Chain-of-Thought-Reasoning-with-tools
A curated list of cutting-edge research papers and resources on Long Chain-of-Thought (CoT) Reasoning with Tools.
☆46 · Dec 17, 2025 · Updated 7 months ago
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆228 · Nov 27, 2025 · Updated 7 months ago
SparksJoe / Prism
A Framework for Decoupling and Assessing the Capabilities of VLMs
☆44 · Jun 28, 2024 · Updated 2 years ago
uq-project / UQ
UQ: Assessing Language Models on Unsolved Questions
☆30 · Aug 26, 2025 · Updated 10 months ago
mll-lab-nu / RAGEN
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
☆2,753 · Apr 14, 2026 · Updated 3 months ago
camel-ai / gecko
☆35 · Jul 8, 2026 · Updated 2 weeks ago
wuxiyang1996 / COS-PLAY
COS-PLAY: Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Game Play
☆30 · Jul 11, 2026 · Updated last week
DripNowhy / Sherlock
[NeurIPS 2025] Official Implementation of paper "Sherlock: Self-Correcting Reasoning in Vision-Language Models"
☆31 · Jun 4, 2026 · Updated last month
complex-reasoning / RPG
[ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)
☆76 · Jun 29, 2026 · Updated 3 weeks ago
ltzheng / SimpleTIR
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆401 · Mar 30, 2026 · Updated 3 months ago
rookie-joe / FormalAlign
☆17 · Jul 12, 2025 · Updated last year
yyht / openrlhf_async_pipline
☆90 · Aug 16, 2025 · Updated 11 months ago
open-compass / ProSA
[EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs
☆29 · May 22, 2025 · Updated last year
HJYao00 / MMReason
[ICCV 2025] MMReason, MLLMs, step by step, reasoning benchmark, AGI
☆15 · Apr 25, 2026 · Updated 2 months ago
xlang-ai / BRIGHT
[ICLR 2025] BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
☆206 · Sep 13, 2025 · Updated 10 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆189 · Jun 5, 2025 · Updated last year
SalesforceAIResearch / FoFo
☆27 · Jun 2, 2026 · Updated last month
neulab / data-agora
[ACL 2025 Main] Official Repository for "Evaluating Language Models as Synthetic Data Generators"
☆41 · Dec 13, 2024 · Updated last year
meowpass / FollowComplexInstruction
Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…
☆55 · Jun 24, 2024 · Updated 2 years ago
Lagooon / LeanSTaR
☆44 · Sep 19, 2024 · Updated last year
AIFrameResearch / SPO
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
☆55 · Sep 19, 2025 · Updated 10 months ago
GAIR-NLP / ToRL
☆352 · May 24, 2025 · Updated last year
ByteDance-Seed / Seed-Thinking-v1.5
☆811 · Jun 9, 2025 · Updated last year
YihongDong / RL-PLUS
☆27 · Aug 31, 2025 · Updated 10 months ago
ibisbill / Transferability-of-LLM-Reasoning
☆111 · Jul 6, 2026 · Updated 2 weeks ago
LGAI-Research / SetR
☆28 · Sep 11, 2025 · Updated 10 months ago