Tree-Shu-Zhao / ferret
An extensible RL framework for training LLM agents with advanced search capabilities, built on VERL and supporting state-of-the-art search strategies.
☆21 Updated 3 weeks ago
Alternatives and similar repositories for ferret
Users who are interested in ferret are comparing it to the repositories listed below.
- ☆29 Updated last month
- Resa: Transparent Reasoning Models via SAEs ☆46 Updated 3 months ago
- A holistic benchmark for LLM abstention ☆67 Updated 3 months ago
- ☆35 Updated 7 months ago
- ☆52 Updated 7 months ago
- Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models" ☆41 Updated 8 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 Updated 2 months ago
- ☆23 Updated last year
- When Reasoning Meets Its Laws ☆23 Updated this week
- [ACL 2025] Knowledge Unlearning for Large Language Models ☆47 Updated 3 months ago
- [NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning" ☆26 Updated 2 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆40 Updated 2 weeks ago
- ☆20 Updated 4 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆115 Updated 6 months ago
- ☆51 Updated 10 months ago
- ☆32 Updated 5 months ago
- ☆22 Updated 5 months ago
- ☆41 Updated 6 months ago
- An automated data pipeline scaling RL to pretraining levels ☆72 Updated 2 months ago
- Leveraging Base Language Models for Few-Shot Synthetic Data Generation ☆38 Updated 2 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆31 Updated 4 months ago
- Code for the paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning" ☆36 Updated 2 months ago
- Code and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref… ☆69 Updated 9 months ago
- [NeurIPS 2025 Spotlight] Official repository for "Web-Shepherd: Advancing PRMs for Reinforcing Web Agents" ☆50 Updated 7 months ago
- ☆24 Updated 8 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling ☆105 Updated 7 months ago
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning ☆109 Updated last month
- Official code implementation for the ACL 2025 paper "Dynamic Scaling of Unit Tests for Code Reward Modeling" ☆27 Updated 7 months ago
- The code implementation of Symbolic-MoE ☆45 Updated 3 months ago
- Exploration of automated dataset selection approaches at large scales. ☆51 Updated 9 months ago