PKU-Alignment / safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
1,336 · Updated 4 months ago

Related projects

Alternatives and complementary repositories for safe-rlhf