GX-XinGao / GRALinks
The Code and Script of "David's Slingshot: A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis"
☆34Updated 4 months ago
Alternatives and similar repositories for GRA
Users that are interested in GRA are comparing it to the libraries listed below
Sorting:
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025)☆11Updated 6 months ago
- ☆45Updated 3 weeks ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆42Updated 8 months ago
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent☆66Updated 5 months ago
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆27Updated last month
- SSRL: Self-Search Reinforcement Learning☆148Updated 2 months ago
- ☆41Updated 2 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆64Updated 5 months ago
- ☆43Updated 3 weeks ago
- ☆38Updated 2 months ago
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search☆60Updated 3 months ago
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.☆78Updated last month
- ☆31Updated 3 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization☆77Updated last month
- WideSearch: Benchmarking Agentic Broad Info-Seeking☆96Updated 3 weeks ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆31Updated 2 months ago
- ☆92Updated 11 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆33Updated last year
- [ICLR 2025]ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning https://arxiv.org/abs/2501.06590☆72Updated 3 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆71Updated 4 months ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025).☆10Updated last month
- ☆39Updated 3 months ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models☆52Updated 11 months ago
- ☆59Updated last year
- Efficient Agent Training for Computer Use☆131Updated last month
- The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆15Updated last month
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning".☆19Updated 8 months ago
- [ICML2025] The official implementation of "C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Gene…☆40Updated 5 months ago
- [NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning☆72Updated last month
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]☆178Updated 3 months ago