xqlin98 / APOHF
Prompt Optimization with Human Feedback
☆12 Updated 5 months ago
Alternatives and similar repositories for APOHF:
Users who are interested in APOHF are comparing it to the repositories listed below.
- Official code for the paper: WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents ☆27 Updated 2 months ago
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆29 Updated 3 months ago
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent." ☆43 Updated 2 months ago
- ☆31 Updated last week
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning ☆36 Updated 6 months ago
- ☆15 Updated 6 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆39 Updated 3 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆66 Updated last month
- On Memorization of Large Language Models in Logical Reasoning ☆20 Updated 2 months ago
- the training and inference code and data for LLMOPT ☆17 Updated 3 months ago
- ☆40 Updated 3 months ago
- ☆19 Updated 2 months ago
- Self-Supervised Alignment with Mutual Information ☆16 Updated 8 months ago
- ☆14 Updated 4 months ago
- [preprint] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆39 Updated last month
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆55 Updated last month
- code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs" ☆94 Updated 10 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆40 Updated 2 months ago
- ☆21 Updated 3 months ago
- The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces… ☆29 Updated 2 weeks ago
- Mosaic IT: Enhancing Instruction Tuning with Data Mosaics ☆17 Updated 6 months ago
- code for paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning ☆34 Updated 10 months ago
- ☆14 Updated 3 months ago
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C… ☆13 Updated 8 months ago
- A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models ☆17 Updated 2 months ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World". ☆24 Updated last month
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025) ☆25 Updated last month
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433 ☆19 Updated last month
- On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability ☆36 Updated last week
- Automatic prompt optimization framework for multi-step agent tasks. ☆27 Updated 2 months ago