AnthenaMatrix / The-I-Exemption-Bypassing-LLM-Ethical-Filters

The "I" Exemption is a curious behavior in some LLMs. We explore how these AI systems may decline to assist directly with an unethical action when the request is phrased in the first person ("I"), yet may explain the same unethical method when the request is rephrased as a general third-person scenario ("they").
11 · Updated 7 months ago

Related projects

Alternatives and complementary repositories for The-I-Exemption-Bypassing-LLM-Ethical-Filters