EIT-NLP / AccuracyParadox-RLHF

[EMNLP 2024 Main] Official implementation of the paper "The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Language Models" (by Yanjun Chen).
14 · Updated 4 months ago

Alternatives and similar repositories for AccuracyParadox-RLHF:

Users who are interested in AccuracyParadox-RLHF compare it to the libraries listed below.