Singla17 / dynamic-alignment-optimization

[EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO uses a search-based optimization framework that lets LLMs iteratively self-improve and design the best alignment instructions without additional training.

☆24

Alternatives and similar repositories for dynamic-alignment-optimization

Users who are interested in dynamic-alignment-optimization are comparing it to the repositories listed below

csitfun / LogiCoT
The instructions and demonstrations for building a formal logical reasoning capable GLM
☆53 Updated 10 months ago
ernie-research / Tool-Augmented-Reward-Model
[ICLR'24 spotlight] Tool-Augmented Reward Modeling
☆51 Updated last month
GAIR-NLP / MetaCritique
Evaluate the Quality of Critique
☆36 Updated last year
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54 Updated last year
TianHongZXY / CoRe
[ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models (LLMs + MCTS + Self-Improvement)
☆49 Updated last year
ChengpengLi1003 / DotaMath
☆30 Updated 6 months ago
DAMO-NLP-SG / contrastive-cot
Contrastive Chain-of-Thought Prompting
☆64 Updated last year
Strong-AI-Lab / Logical-and-abstract-reasoning
Evaluation on Logical Reasoning and Abstract Reasoning Challenges
☆28 Updated 2 months ago
MetaCopilot / dseval
☆25 Updated last year
lifan-yuan / CRAFT
Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets"
☆57 Updated last year
jiangjiechen / auction-arena
Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A…
☆45 Updated last year
qiancheng0 / CREATOR
This is the repository for paper "CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models"
☆25 Updated last year
Reason-Wang / NAT
[NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…
☆26 Updated last year
technion-cs-nlp / hallucination-mitigation
☆22 Updated 7 months ago
FranxYao / Complexity-Based-Prompting
Complexity Based Prompting for Multi-Step Reasoning
☆17 Updated 2 years ago
HowieHwong / MetaTool
[ICLR 2024] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
☆89 Updated last year
yikee / Knowledge_Conflict
Resolving Knowledge Conflicts in Large Language Models, COLM 2024
☆17 Updated last month
Open-Source-O1 / o1_Reasoning_Patterns_Study
☆102 Updated 7 months ago
WeiminXiong / IPR
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)
☆60 Updated 9 months ago
Di-viner / LLM-Robustness-to-Irrelevant-Information
[COLM'24] "How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?"
☆22 Updated 9 months ago
BunsenFeng / AbstainQA
AbstainQA, ACL 2024
☆27 Updated 9 months ago
wzhouad / context-faithful-llm
Code and data for paper "Context-faithful Prompting for Large Language Models".
☆40 Updated 2 years ago
zjunlp / TRICE
[NAACL 2024] Making Language Models Better Tool Learners with Execution Feedback
☆41 Updated last year
wwxu21 / CUT
Source code of "Reasons to Reject? Aligning Language Models with Judgments"
☆58 Updated last year
qtli / GSM-Plus
GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.
☆62 Updated last year
OpenLMLab / LongWanjuan
Towards Systematic Measurement for Long Text Quality
☆36 Updated 10 months ago
koalazf99 / Awesome-DataCentric-LLM
Trending projects & awesome papers about data-centric LLM studies.
☆37 Updated last month
NanshineLoong / Self-Evolving-Benchmark
A framework for evolving and testing question-answering datasets with various models.
☆16 Updated last year
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆61 Updated 7 months ago
princeton-nlp / LLMBar
[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following
☆127 Updated last year