ZHZisZZ / emulated-disalignment

[ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
39 | Aug 2, 2024 | Updated last year

Alternatives and similar repositories for emulated-disalignment

Users who are interested in emulated-disalignment are comparing it to the libraries listed below.
