xqlin98 / APOHF
Prompt Optimization with Human Feedback
☆18Updated last year
Alternatives and similar repositories for APOHF
Users that are interested in APOHF are comparing it to the libraries listed below
- [NAACL 25 main] Awesome LLM Causal Reasoning is a collection of LLM-based causal reasoning works, including papers, code, and datasets.☆73Updated 6 months ago
- Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents☆40Updated 3 months ago
- ☆33Updated 3 weeks ago
- MARFT stands for Multi-Agent Reinforcement Fine-Tuning. This repository implements an LLM-based multi-agent reinforcement fine-tuning fra…☆60Updated 3 weeks ago
- ☆50Updated 5 months ago
- code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs"☆131Updated last year
- code for paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning☆41Updated last year
- [ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization"☆17Updated last year
- Reinforced Multi-LLM Agents training☆40Updated 2 months ago
- Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting"☆70Updated last year
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback☆119Updated 5 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$☆47Updated 10 months ago
- ☆46Updated 10 months ago
- [ICML 2024] One Prompt is Not Enough: Automated Construction of a Mixture-of-Expert Prompts - TurningPoint AI☆26Updated 11 months ago
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent."☆84Updated 9 months ago
- A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)☆181Updated 3 weeks ago
- A Sober Look at Language Model Reasoning☆81Updated 2 months ago
- ☆26Updated 4 months ago
- This is the code of MMOA-RAG.☆72Updated 3 months ago
- official implementation of paper "Process Reward Model with Q-value Rankings"☆60Updated 6 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆95Updated 8 months ago
- [ACL2025 Best Paper] Language Models Resist Alignment☆23Updated 2 months ago
- ☆48Updated 3 months ago
- Natural Language Reinforcement Learning☆95Updated last month
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆120Updated 6 months ago
- [NeurIPS 2024] Agent Planning with World Knowledge Model☆148Updated 8 months ago
- ☆45Updated 4 months ago
- Official Implementation of "DeLLMa: Decision Making Under Uncertainty with Large Language Models"☆64Updated 10 months ago
- ☆32Updated 10 months ago
- CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving (NAACL 2024 Findings)☆16Updated last year