Value4AI / ValueBench
[ACL 2024] ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models
☆16 · Updated 2 months ago
Alternatives and similar repositories for ValueBench:
Users interested in ValueBench are also comparing it to the repositories listed below.
- The training and inference code and data for LLMOPT ☆38 · Updated 2 weeks ago
- Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and… ☆37 · Updated 3 weeks ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆72 · Updated 7 months ago
- Source code of “Reinforcement Learning with Token-level Feedback for Controllable Text Generation” (NAACL 2024) ☆11 · Updated 3 months ago
- Repo of “Large Language Model-based Human-Agent Collaboration for Complex Task Solving” (EMNLP 2024 Findings) ☆31 · Updated 6 months ago
- Code for the paper “Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning” ☆39 · Updated last year
- [ICML'24 Oral] Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems ☆32 · Updated 7 months ago
- ☆30 · Updated 5 months ago
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆35 · Updated 11 months ago
- Official code for “Decoding-Time Language Model Alignment with Multiple Objectives” ☆19 · Updated 5 months ago
- PyTorch code for GNARKD ☆22 · Updated last year
- Official implementation of Rewarded Soups ☆56 · Updated last year
- ☆28 · Updated 11 months ago
- ☆30 · Updated last year
- [AAAI 2025] Measuring Human and AI Values Based on Generative Psychometrics with Large Language Models ☆35 · Updated last month
- Domain-specific preference (DSP) data and customized RM fine-tuning ☆25 · Updated last year
- Natural Language Reinforcement Learning ☆84 · Updated 3 months ago
- [NeurIPS 2024] “Collaboration! Towards Robust Neural Methods for Routing Problems” ☆18 · Updated 4 months ago
- Benchmarking LLMs' Gaming Ability in Multi-Agent Environments ☆71 · Updated last month
- [ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! ☆34 · Updated 8 months ago
- Code for the NeurIPS 2024 paper “Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs” ☆28 · Updated last month
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆130 · Updated last month
- Implementation of the ICML 2024 paper “Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning” pr… ☆95 · Updated last year
- [ICLR 2025] Official code of “Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization” ☆11 · Updated 10 months ago
- Code for the ICML 2024 paper “Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment” ☆62 · Updated 3 months ago
- [ICML 2023] Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optim… ☆10 · Updated last year
- ☆103 · Updated 2 months ago
- Code for the paper “Policy Optimization in RLHF: The Impact of Out-of-preference Data” ☆28 · Updated last year
- [ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correla… ☆45 · Updated last month
- Official implementation of the paper “Chain-of-Experts: When LLMs Meet Complex Operations Research Problems” ☆91 · Updated last month