RUCAIBox / R1-Searcher
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
☆566 Updated 3 weeks ago
Alternatives and similar repositories for R1-Searcher
Users who are interested in R1-Searcher are comparing it to the repositories listed below
- ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning ☆969 Updated last month
- Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning ☆573 Updated 3 weeks ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. ☆461 Updated 2 months ago
- A series of technical reports on Slow Thinking with LLM ☆699 Updated 2 weeks ago
- ☆242 Updated last month
- ☆241 Updated 2 weeks ago
- ☆540 Updated 5 months ago
- Search-o1: Agentic Search-Enhanced Large Reasoning Models ☆927 Updated last month
- ☆717 Updated 3 weeks ago
- ☆288 Updated 11 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆341 Updated 2 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆639 Updated 5 months ago
- Large Reasoning Models ☆804 Updated 6 months ago
- ☆220 Updated last month
- Agentic RAG R1 Framework via Reinforcement Learning ☆215 Updated last month
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data … ☆717 Updated 3 months ago
- Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi e… ☆491 Updated 3 months ago
- ☆300 Updated 3 weeks ago
- Real-time updated, fine-grained reading list on LLM-synthetic-data.🔥 ☆262 Updated 5 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆144 Updated 6 months ago
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo… ☆373 Updated 9 months ago
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale ☆251 Updated 2 weeks ago
- ☆782 Updated last month
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior. ☆241 Updated 2 months ago
- AN O1 REPLICATION FOR CODING ☆335 Updated 6 months ago
- ☆144 Updated 5 months ago
- ☆152 Updated last month
- Awesome Agent Training ☆164 Updated this week
- The related works and background techniques about OpenAI o1 ☆222 Updated 5 months ago
- Collect every awesome work about r1! ☆388 Updated last month