xyq7 / GradSafe

Official code for the ACL 2024 paper "GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis"
47 · Updated 3 months ago

Alternatives and similar repositories for GradSafe:

Users who are interested in GradSafe are comparing it to the libraries listed below.