ritaranx / AceSearcher
This is the code repo for the paper AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play (NeurIPS 2025 Spotlight).
☆24 Updated 4 months ago
Alternatives and similar repositories for AceSearcher
Users who are interested in AceSearcher are comparing it to the repositories listed below.
- ☆36 Updated last year
- [EMNLP 2024] This is the code for our paper "BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers". ☆23 Updated last year
- [ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆48 Updated last month
- Code and data for ACL 2024 paper on 'Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space' ☆18 Updated last year
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆15 Updated this week
- ☆43 Updated 5 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆52 Updated last year
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs ☆29 Updated 8 months ago
- [NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis" ☆26 Updated 2 months ago
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to … ☆58 Updated 2 weeks ago
- ☆29 Updated 10 months ago
- ☆48 Updated last month
- RL with Experience Replay ☆55 Updated 6 months ago
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆75 Updated 6 months ago
- DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue ☆64 Updated 2 weeks ago
- [AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data ☆33 Updated 10 months ago
- ☆17 Updated 2 months ago
- Code for Heima ☆59 Updated 9 months ago
- [ACL 2025 Main] (🏆 Outstanding Paper Award) Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Proba… ☆15 Updated 5 months ago
- [NeurIPS25] RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality ☆19 Updated 3 months ago
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆29 Updated 4 months ago
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL). ☆47 Updated 3 months ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 Updated last year
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models" ☆27 Updated 8 months ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆24 Updated 4 months ago
- Official Repository of LatentSeek ☆76 Updated 8 months ago
- ☆25 Updated 10 months ago
- ☆59 Updated 3 weeks ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆50 Updated this week
- ☆47 Updated 10 months ago