dengmengjie / ToolScope
Official repository for ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use
☆27, updated 2 months ago
Alternatives and similar repositories for ToolScope
Users who are interested in ToolScope are comparing it to the repositories listed below.
- The demo, code and data of FollowRAG ☆75, updated 7 months ago
- This is the code repo for the paper "Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning". ☆33, updated 5 months ago
- The code and data of DPA-RAG, accepted by WWW 2025 main conference. ☆63, updated 3 months ago
- Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedbac… ☆26, updated last week
- Scaling Long-Horizon LLM Agent via Context-Folding ☆101, updated this week
- [COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? ☆36, updated 7 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning ☆71, updated 8 months ago
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain" ☆80, updated last year
- ☆177, updated last month
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to … ☆57, updated this week
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆312, updated 3 weeks ago
- HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches ☆35, updated 3 months ago
- The official implementation of "EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis". ☆78, updated last week
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents ☆290, updated 2 months ago
- [NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge ☆98, updated last month
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search ☆17, updated last week
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions. ☆52, updated 5 months ago
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning ☆150, updated 4 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆104, updated 4 months ago
- [ICLR 2025] This is the code repo for our ICLR’25 paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rew… ☆50, updated 11 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆146, updated last month
- REverse-Engineered Reasoning for Open-Ended Generation ☆89, updated 4 months ago
- [AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆95, updated 2 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆182, updated 6 months ago
- 🔍 Awesome Agentic Search is a curated list of papers, tools, and resources on agentic search—where AI agents plan, search, and reason to… ☆52, updated 5 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression ☆131, updated 9 months ago
- [NeurIPS 2025 D&B (Spotlight🌟)] TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario ☆29, updated 3 months ago
- [FSE'2026] SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ☆138, updated this week
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆97, updated 11 months ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆53, updated last year