ZHZisZZ / emulated-disalignment

[ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
29 · Updated 3 months ago

Related projects

Alternatives and complementary repositories for emulated-disalignment