princeton-nlp / InstructEval

[NAACL 2024 Findings] Evaluation suite for the systematic evaluation of instruction selection methods.

☆22

Alternatives and similar repositories for InstructEval

Users who are interested in InstructEval are comparing it to the libraries listed below.

ernie-research / Tool-Augmented-Reward-Model
[ICLR'24 spotlight] Tool-Augmented Reward Modeling
☆50Updated last month
dunzeng / MORE
Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment
☆16Updated 11 months ago
psunlpgroup / ReaLMistake
This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".
☆30Updated 10 months ago
martin-wey / CodeUltraFeedback
CodeUltraFeedback: aligning large language models to coding preferences
☆71Updated last year
Re-Align / just-eval
A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
☆85Updated last year
cambridgeltl / PairS
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)
☆47Updated 5 months ago
hamishivi / automated-instruction-selection
Exploration of automated dataset selection approaches at large scales.
☆47Updated 4 months ago
yale-nlp / refdpo
☆16Updated 11 months ago
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17Updated last year
facebookresearch / RLCD
Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment"
☆69Updated last year
RUCAIBox / RLMEC
The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"
☆38Updated last year
ShiZhengyan / PowerfulPromptFT
[NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Lea…
☆74Updated last year
GAIR-NLP / MetaCritique
Evaluate the Quality of Critique
☆36Updated last year
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆35Updated 9 months ago
allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48Updated last year
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54Updated last year
Reason-Wang / NAT
[NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…
☆26Updated last year
maszhongming / ParaKnowTransfer
Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective"
☆32Updated last year
csitfun / LogiCoT
The instructions and demonstrations for building a GLM capable of formal logical reasoning
☆53Updated 10 months ago
awwang10 / llmpromptboosting
Accompanying code for "Boosted Prompt Ensembles for Large Language Models"
☆30Updated 2 years ago
tml-epfl / icl-alignment
Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]
☆30Updated 5 months ago
princeton-nlp / Collie
[ICLR 2024] COLLIE: Systematic Construction of Constrained Text Generation Tasks
☆51Updated last year
xxxiaol / QRData
Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data
☆41Updated 4 months ago
Ablustrund / APPS_Plus
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
☆67Updated 10 months ago
sher222 / LeReT
Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
☆39Updated 8 months ago
eric11eca / disco
Generating diverse counterfactual data for Natural Language Understanding tasks using Large Language Models (LLMs). The generator support…
☆37Updated last year
THUNLP-MT / PGRA
Prompt-Guided Retrieval For Non-Knowledge-Intensive Tasks
☆12Updated last year
UIC-Liu-Lab / CPT
[EMNLP 2022] Continual Training of Language Models for Few-Shot Learning
☆45Updated 2 years ago
yidingjiang / ado
The repository contains code for Adaptive Data Optimization
☆25Updated 7 months ago
austrian-code-wizard / c3po
☆27Updated 2 weeks ago