yuplin2333 / representation-space-jailbreak
Code repository for our paper "Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis" (https://arxiv.org/abs/2406.10794)
24 · Jul 26, 2024 · Updated last year

Alternatives and similar repositories for representation-space-jailbreak

Users that are interested in representation-space-jailbreak are comparing it to the libraries listed below.
