GX-XinGao / GRA
The Code and Script of "David's Slingshot: A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis"
☆31 Updated last month
Alternatives and similar repositories for GRA
Users who are interested in GRA are comparing it to the repositories listed below
- ☆88 Updated 8 months ago
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search ☆47 Updated 2 weeks ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆27 Updated 4 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆38 Updated 4 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling ☆99 Updated 2 months ago
- ☆22 Updated 7 months ago
- ☆10 Updated 3 months ago
- Open-Source LLM Coders with Co-Evolving Reinforcement Learning ☆93 Updated last week
- ☆18 Updated 6 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆144 Updated 2 weeks ago
- ☆50 Updated last month
- ☆50 Updated last month
- ☆32 Updated 3 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆96 Updated last month
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent ☆60 Updated 2 months ago
- PGRAG ☆52 Updated last year
- Code for paper: Long cOntext aliGnment via efficient preference Optimization ☆14 Updated 5 months ago
- [ACL 2025] Agentic Knowledgeable Self-awareness ☆75 Updated last month
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆116 Updated last year
- ☆47 Updated 5 months ago
- ☆22 Updated last year
- ☆90 Updated 2 months ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆75 Updated 9 months ago
- ☆45 Updated 2 months ago
- A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza… ☆19 Updated 8 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆74 Updated 3 months ago
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models ☆38 Updated 9 months ago
- Efficient Agent Training for Computer Use ☆116 Updated last month
- ☆63 Updated 9 months ago
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning ☆63 Updated 4 months ago