YoungDubbyDu / LLM-Agent-Optimization
This is the reading list for the survey "A Survey on the Optimization of LLM-based Agents". We will keep adding papers and improving the list. Any suggestions and PRs are welcome!
☆24 Updated 3 weeks ago
Alternatives and similar repositories for LLM-Agent-Optimization:
Users who are interested in LLM-Agent-Optimization are comparing it to the repositories listed below.
- ☆59 Updated this week
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆63 Updated last month
- ☆54 Updated 5 months ago
- ☆138 Updated 2 weeks ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 Updated 4 months ago
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and… ☆37 Updated 2 weeks ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆94 Updated last year
- ☆113 Updated 2 months ago
- SOTA RL fine-tuning solution for advanced math reasoning of LLM ☆92 Updated this week
- ☆35 Updated 3 weeks ago
- ☆47 Updated last month
- ☆22 Updated 8 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆125 Updated 3 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆104 Updated last week
- This is a unified platform for implementing and evaluating test-time reasoning mechanisms in Large Language Models (LLMs). ☆14 Updated 2 months ago
- [ACM Computing Surveys 2025] This repository collects awesome survey, resource, and paper for Lifelong Learning with Large Language Model … ☆115 Updated last month
- ☆41 Updated 5 months ago
- ☆216 Updated this week
- The code of arxiv paper: "CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis" ☆23 Updated 2 months ago
- This is the implementation of LeCo ☆32 Updated 2 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆72 Updated 7 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆132 Updated 5 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆167 Updated 2 months ago
- [preprint] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆43 Updated 3 months ago
- ☆49 Updated last month
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso… ☆75 Updated last week
- On Memorization of Large Language Models in Logical Reasoning ☆59 Updated 4 months ago
- ☆43 Updated 5 months ago
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents ☆79 Updated last month
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆50 Updated 2 weeks ago