ReTool-RL / ReTool
☆96Updated last month
Alternatives and similar repositories for ReTool
Users that are interested in ReTool are comparing it to the libraries listed below
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate"☆151Updated last month
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆207Updated 3 weeks ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆73Updated last month
- ☆198Updated last week
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆93Updated 3 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning☆213Updated 2 weeks ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".☆94Updated 2 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling☆102Updated 4 months ago
- ☆60Updated last week
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆79Updated 3 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code"☆59Updated last month
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations☆106Updated last month
- The official code of paper “Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning”☆99Updated this week
- ☆193Updated this week
- Official implementation of paper "Process Reward Model with Q-value Rankings"☆59Updated 3 months ago
- Repo of paper "Free Process Rewards without Process Labels"☆149Updated 2 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆121Updated 2 months ago
- A version of verl to support tool use☆41Updated this week
- ☆145Updated last week
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆70Updated 2 months ago
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"☆140Updated 2 months ago
- Official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example”☆223Updated last week
- ☆173Updated this week
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆98Updated 3 weeks ago
- A Comprehensive Survey on Long Context Language Modeling☆147Updated 2 weeks ago
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso…☆111Updated 2 months ago
- ☆173Updated 2 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*☆105Updated 5 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆90Updated 2 months ago
- On Memorization of Large Language Models in Logical Reasoning☆65Updated 2 months ago