likenneth / dialogue_action_token
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
☆29 Updated last year
Alternatives and similar repositories for dialogue_action_token
Users who are interested in dialogue_action_token compare it with the repositories listed below
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆65 Updated last year
- ☆102 Updated 2 years ago
- ☆53 Updated 11 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆159 Updated last year
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆63 Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated last year
- Evaluate the Quality of Critique ☆36 Updated last year
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆50 Updated last year
- Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits". ☆44 Updated last year
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆65 Updated 11 months ago
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following ☆136 Updated last year
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆124 Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators ☆43 Updated last year
- RL Scaling and Test-Time Scaling (ICML'25) ☆112 Updated last year
- e ☆43 Updated 9 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆116 Updated last week
- Critique-out-Loud Reward Models ☆73 Updated last year
- ☆117 Updated last year
- Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha… ☆132 Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆39 Updated 2 years ago
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆116 Updated 2 years ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆115 Updated last year
- Directional Preference Alignment ☆58 Updated last year
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆83 Updated last year
- Do Large Language Models Know What They Don’t Know? ☆102 Updated last year
- augmented LLM with self reflection ☆136 Updated 2 years ago
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025) ☆73 Updated last year
- Trending projects & awesome papers about data-centric llm studies. ☆39 Updated 8 months ago
- Code for ACL2024 paper - Adversarial Preference Optimization (APO). ☆56 Updated last year
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆77 Updated 3 months ago