sfasfaffa / DLPO
Official code for the EMNLP 2025 Findings paper "DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective"
☆10 · Updated last month
Alternatives and similar repositories for DLPO
Users who are interested in DLPO are comparing it to the libraries listed below.
- [ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet ☆237 · Updated 2 months ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆411 · Updated 2 months ago
- ☆213 · Updated 6 months ago
- ☆332 · Updated 8 months ago
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents ☆290 · Updated 2 months ago
- A comprehensive collection of process reward models. ☆135 · Updated 3 months ago
- SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation ☆56 · Updated 6 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆274 · Updated this week
- LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey | Awesome Human-Agent Collaboration | Human-AI Collaboration ☆185 · Updated last week
- ☆266 · Updated 5 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆312 · Updated 3 weeks ago
- ICLR 2025 Agent-Related Papers ☆75 · Updated last year
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" ☆153 · Updated 3 months ago
- 🔥🔥🔥 ICLR 2025 Oral. Automating Agentic Workflow Generation. ☆418 · Updated last month
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. ☆414 · Updated 6 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆142 · Updated 11 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆406 · Updated 3 months ago
- ☆229 · Updated 3 weeks ago
- A comprehensive framework for benchmarking single and multi-agent systems across a wide range of tasks—evaluating performance, accuracy, … ☆35 · Updated 2 months ago
- [ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆351 · Updated 2 weeks ago
- A version of verl to support diverse tool use ☆852 · Updated 3 weeks ago
- ☆204 · Updated last month
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆116 · Updated 5 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆148 · Updated 8 months ago
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language … ☆169 · Updated 8 months ago
- ☆198 · Updated last year
- ☆30 · Updated last year
- ☆153 · Updated 8 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆260 · Updated 8 months ago
- ☆490 · Updated 3 months ago