StigLidu / DualDistill
[EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"
☆102 Updated 5 months ago
Alternatives and similar repositories for DualDistill
Users who are interested in DualDistill are comparing it to the repositories listed below.
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆125 Updated 7 months ago
- SSRL: Self-Search Reinforcement Learning ☆206 Updated 5 months ago
- ☆67 Updated 10 months ago
- LIMI: Less is More for Agency ☆160 Updated 3 months ago
- accompanying material for sleep-time compute paper ☆119 Updated 9 months ago
- DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL ☆281 Updated 4 months ago
- [ICLR 2026] Efficient Agent Training for Computer Use ☆135 Updated 5 months ago
- Verifiers for LLM Reinforcement Learning ☆80 Updated 9 months ago
- [ACL 2025] Agentic Knowledgeable Self-awareness ☆91 Updated 7 months ago
- Process Reward Models That Think ☆78 Updated 2 months ago
- ☆23 Updated last year
- Open-source Agentic RL for LLMs — RLAnything & DemyAgent ☆223 Updated last week
- ☆19 Updated 11 months ago
- ☆102 Updated last month
- LLM-in-Sandbox Elicits General Agentic Intelligence ☆167 Updated 2 weeks ago
- Data Synthesis for Deep Research Based on Semi-Structured Data ☆197 Updated last month
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆68 Updated 9 months ago
- When Reasoning Meets Its Laws ☆35 Updated last month
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆358 Updated 7 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆90 Updated 10 months ago
- Official repo of paper LM2 ☆46 Updated 11 months ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ☆96 Updated 2 months ago
- Training Proactive and Personalized LLM Agents ☆98 Updated 2 weeks ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆143 Updated last year
- Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI. ☆90 Updated 3 months ago
- ☆131 Updated last month
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508) ☆65 Updated last week
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆261 Updated last week
- ☆229 Updated 11 months ago
- Streamline on-policy/off-policy distillation workflows in a few lines of code ☆95 Updated this week