sjtu-sai-agents / Browse-Master
Official implementation of Browse-Master, a tool-augmented web-search agent.
☆19 Updated last month
Alternatives and similar repositories for Browse-Master
Users who are interested in Browse-Master are comparing it to the repositories listed below.
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆45 Updated last week
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆138 Updated 3 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆39 Updated 3 weeks ago
- Tree Search for LLM Agent Reinforcement Learning ☆113 Updated last week
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆59 Updated 11 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning ☆73 Updated 10 months ago
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost ☆92 Updated 2 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆46 Updated 7 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆73 Updated 10 months ago
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 8 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆71 Updated 4 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆33 Updated last year
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25] ☆32 Updated last month
- ☆22 Updated last year
- EMNLP MAIN 2025 StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization ☆39 Updated 3 weeks ago
- Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?" ☆47 Updated 4 months ago
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows" ☆111 Updated last month
- SSRL: Self-Search Reinforcement Learning ☆145 Updated last month
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions. ☆47 Updated 2 months ago
- ☆41 Updated 10 months ago
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆26 Updated 3 weeks ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆40 Updated 7 months ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆24 Updated last year
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆88 Updated 6 months ago
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" ☆42 Updated last week
- Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling' ☆25 Updated 4 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆47 Updated 2 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆48 Updated 11 months ago
- ☆45 Updated last week
- [EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time ☆83 Updated 4 months ago