Berkeley-NLP / Agent-Eval-Refine
Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]
☆134Updated 4 months ago
Alternatives and similar repositories for Agent-Eval-Refine:
Users who are interested in Agent-Eval-Refine are comparing it to the libraries listed below
- ☆81Updated this week
- ☆157Updated 3 weeks ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆135Updated 5 months ago
- ☆107Updated 3 months ago
- "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents"☆68Updated 2 weeks ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Updated last year
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples☆84Updated last month
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners☆79Updated 3 weeks ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆47Updated 2 months ago
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap…☆182Updated this week
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024)☆62Updated 11 months ago
- Natural Language Reinforcement Learning☆87Updated 4 months ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆182Updated last week
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆162Updated last week
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]☆132Updated 7 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"☆74Updated 10 months ago
- ☆57Updated last month
- augmented LLM with self reflection☆119Updated last year
- ☆101Updated 2 weeks ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling☆101Updated 3 months ago
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback☆107Updated 3 weeks ago
- ☆90Updated 9 months ago
- Self-Alignment with Principle-Following Reward Models☆160Updated last year
- ☆121Updated 10 months ago
- ☆142Updated 11 months ago
- ☆125Updated 3 weeks ago
- TTRL: Test-Time Reinforcement Learning☆166Updated this week
- Repository for the paper Stream of Search: Learning to Search in Language☆145Updated 2 months ago
- MPO: Boosting LLM Agents with Meta Plan Optimization☆50Updated last month
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning☆94Updated last week