elated-sawyer / WALL-E
Official code for the paper: WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents
☆17 · Updated last month
Related projects
Alternatives and complementary repositories for WALL-E
- ☆22 · Updated 2 months ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆72 · Updated 9 months ago
- Official repository for paper "GTA: A Benchmark for General Tool Agents" (NeurIPS 2024 D&B Track) ☆43 · Updated last week
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆96 · Updated last week
- ☆21 · Updated 5 months ago
- ☆41 · Updated 2 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ☆94 · Updated 3 weeks ago
- Benchmarking LLMs' Gaming Ability in Multi-Agent Environments ☆39 · Updated last month
- ☆15 · Updated last month
- ☆15 · Updated 3 months ago
- Directional Preference Alignment ☆49 · Updated last month
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆33 · Updated 3 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆52 · Updated 2 months ago
- ☆74 · Updated 5 months ago
- Code for paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning ☆32 · Updated 7 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆61 · Updated 3 weeks ago
- The code for "Can Large Language Model Agents Simulate Human Trust Behaviors?" ☆38 · Updated last week
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback ☆92 · Updated last year
- ☆24 · Updated 6 months ago
- ☆73 · Updated 4 months ago
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ☆49 · Updated 2 months ago
- ☆27 · Updated last week
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆95 · Updated 2 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆105 · Updated 7 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆38 · Updated last month
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆39 · Updated 3 months ago
- Implementation of the MATRIX framework (ICML 2024) ☆39 · Updated 6 months ago
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models ☆11 · Updated this week
- Reasoning with Language Model is Planning with World Model ☆145 · Updated last year
- [ACL 2024] The project of Symbol-LLM ☆41 · Updated 4 months ago