gao-g / prelude
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
☆43Updated last year
Alternatives and similar repositories for prelude
Users who are interested in prelude are comparing it to the repositories listed below
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Updated last year
- Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner☆28Updated last year
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆125Updated last year
- Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models)☆48Updated 7 months ago
- ☆102Updated 2 years ago
- Official code of the paper Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Le…☆75Updated last year
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering☆63Updated last year
- Evaluate the Quality of Critique☆36Updated last year
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging☆113Updated 2 years ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆159Updated last year
- augmented LLM with self reflection☆135Updated 2 years ago
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization☆32Updated 10 months ago
- ☆42Updated last year
- Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning".☆163Updated last year
- official implementation of paper "Process Reward Model with Q-value Rankings"☆65Updated 10 months ago
- ☆57Updated 7 months ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)☆64Updated last year
- ☆49Updated 10 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆112Updated 4 months ago
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment☆57Updated last year
- Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha…☆133Updated last year
- Models, data, and codes for the paper: MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models☆24Updated last year
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr…☆114Updated last year
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…☆29Updated last year
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"☆41Updated last year
- Directional Preference Alignment☆58Updated last year
- Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023).☆17Updated 11 months ago
- Code for paper Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding☆87Updated last year
- Self-Alignment with Principle-Following Reward Models☆169Updated 3 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆147Updated last year