Jiaxin-Wen / MisleadLM

Official code for our paper: "Language Models Learn to Mislead Humans via RLHF"
13 · Updated 6 months ago

Alternatives and similar repositories for MisleadLM:

Users who are interested in MisleadLM are comparing it to the libraries listed below.