GX-XinGao / GRA
The code and scripts for "David's Slingshot: A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis"
☆34 Updated 6 months ago
Alternatives and similar repositories for GRA
Users who are interested in GRA compare it to the repositories listed below.
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper ☆33 Updated last year
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆43 Updated 9 months ago
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025) ☆11 Updated 7 months ago
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search ☆63 Updated 5 months ago
- Code for paper: Long cOntext aliGnment via efficient preference Optimization ☆23 Updated 2 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆31 Updated 4 months ago
- ☆94 Updated last year
- ☆46 Updated 2 months ago
- WideSearch: Benchmarking Agentic Broad Info-Seeking ☆103 Updated 2 months ago
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆67 Updated 8 months ago
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models ☆39 Updated last year
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆89 Updated last year
- ☆95 Updated last year
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution ☆49 Updated this week
- ☆13 Updated 10 months ago
- ☆92 Updated 6 months ago
- ☆46 Updated 6 months ago
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆27 Updated 2 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 Updated 2 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆80 Updated 2 months ago
- ☆23 Updated last year
- ☆98 Updated 4 months ago
- Codebase for Instruction Following without Instruction Tuning ☆36 Updated last year
- ☆50 Updated 6 months ago
- ☆19 Updated 9 months ago
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197) ☆33 Updated 3 months ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025). ☆12 Updated 2 months ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆53 Updated last year
- [NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning ☆92 Updated 2 months ago
- ☆60 Updated last year