rb81 / prompt-hacking-classifier
A simple prompt-based approach to detecting prompt injection and jailbreaking attempts using small, self-hosted language models.
18 · Jun 15, 2026 · Updated 2 weeks ago

Alternatives and similar repositories for prompt-hacking-classifier

Users who are interested in prompt-hacking-classifier are comparing it to the libraries listed below.
