RUC-NLPIR / WebThinker
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
☆147 · Updated 2 weeks ago
Alternatives and similar repositories for WebThinker:
Users who are interested in WebThinker are comparing it to the repositories listed below.
- The demo, code and data of FollowRAG ☆71 · Updated this week
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso… ☆101 · Updated last month
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆65 · Updated 4 months ago
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain" ☆55 · Updated 4 months ago
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track) ☆80 · Updated 2 months ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. ☆282 · Updated last week
- ☆143 · Updated 9 months ago
- ☆140 · Updated last year
- CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation ☆47 · Updated 2 months ago
- ☆130 · Updated 3 months ago
- ☆55 · Updated 6 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆147 · Updated 7 months ago
- ☆97 · Updated last year
- [ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues ☆83 · Updated 9 months ago
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs (ACL 2024) ☆63 · Updated 2 weeks ago
- ☆81 · Updated last year
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆477 · Updated this week
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆41 · Updated last year
- A curated list of awesome works in Routing LLMs paradigm (Welcome to submit your contributions to this code repository) ☆30 · Updated last month
- ☆157 · Updated 3 weeks ago
- Code implementation of synthetic continued pretraining ☆104 · Updated 3 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆123 · Updated 5 months ago
- ☆46 · Updated 10 months ago
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models ☆182 · Updated 6 months ago
- Codes for our paper "RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation" ☆173 · Updated 8 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆133 · Updated 4 months ago
- Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs. ☆42 · Updated 10 months ago
- Reformatted Alignment ☆115 · Updated 7 months ago
- ☆51 · Updated 7 months ago
- The code and data of DPA-RAG ☆58 · Updated 3 months ago