Tree-Shu-Zhao / ferret
An extensible RL framework for training LLM agents with advanced search capabilities, built on VERL and supporting state-of-the-art search strategies.
☆18Updated this week
Alternatives and similar repositories for ferret
Users who are interested in ferret compare it to the libraries listed below.
- A holistic benchmark for LLM abstention☆61Updated 3 months ago
- ☆52Updated 6 months ago
- ☆29Updated 3 weeks ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions☆29Updated last month
- JudgeLRM: Large Reasoning Models as a Judge☆40Updated 2 months ago
- [ACL 2025] Knowledge Unlearning for Large Language Models☆46Updated 2 months ago
- ☆22Updated 4 months ago
- Resa: Transparent Reasoning Models via SAEs☆44Updated 2 months ago
- ☆46Updated 2 months ago
- ☆23Updated 11 months ago
- ☆35Updated 6 months ago
- [NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning"☆25Updated last month
- ☆25Updated 7 months ago
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"☆53Updated 2 months ago
- [NeurIPS 2025 Spotlight] Official repository for "Web-Shepherd: Advancing PRMs for Reinforcing Web Agents"☆49Updated 6 months ago
- ☆20Updated 4 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM"☆16Updated last month
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models☆123Updated last year
- Official Code Release for "Training a Generally Curious Agent"☆39Updated 6 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆35Updated 9 months ago
- ☆50Updated last month
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆115Updated 5 months ago
- Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref…☆68Updated 9 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling☆106Updated 6 months ago
- An automated data pipeline scaling RL to pretraining levels☆71Updated last month
- Exploration of automated dataset selection approaches at large scales.☆50Updated 9 months ago
- ☆24Updated last year
- Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models"☆42Updated 7 months ago
- SSRL: Self-Search Reinforcement Learning☆157Updated 3 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆70Updated 6 months ago