ajyl / dpo_toxic

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.
54 · Updated 2 weeks ago

Related projects

Alternatives and complementary repositories for dpo_toxic