liziniu / policy_optimization

Code for the paper "Policy Optimization in RLHF: The Impact of Out-of-preference Data"
23 · Updated 11 months ago

Related projects

Alternatives and complementary repositories for policy_optimization