amazon-science / PAE
☆52 Updated last week
Alternatives and similar repositories for PAE:
Users who are interested in PAE are comparing it to the repositories listed below:
- ☆101 Updated last month
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆145 Updated 11 months ago
- ☆79 Updated 8 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆128 Updated 4 months ago
- Natural Language Reinforcement Learning ☆77 Updated 2 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ☆129 Updated 3 months ago
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback ☆103 Updated last year
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ☆54 Updated 2 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆127 Updated 4 months ago
- ☆35 Updated last week
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System" ☆52 Updated 3 months ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆92 Updated last year
- SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. … ☆134 Updated 11 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆118 Updated 6 months ago
- Official implementation of paper "Process Reward Model with Q-value Rankings" ☆49 Updated last month
- Repo of paper "Free Process Rewards without Process Labels" ☆132 Updated this week
- Rewarded soups official implementation ☆54 Updated last year
- The official implementation of Self-Exploring Language Models (SELM) ☆62 Updated 9 months ago
- ☆29 Updated 4 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆45 Updated 4 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆101 Updated 11 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆76 Updated last week
- Official Repo of LangSuitE ☆82 Updated 6 months ago
- ☆110 Updated 2 weeks ago
- WONDERBREAD benchmark + dataset for BPM tasks ☆24 Updated 4 months ago
- The source code of the paper "Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Pla… ☆85 Updated 7 months ago
- ☆95 Updated 8 months ago