centerforaisafety / HarmBench

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
341 · Updated 3 months ago

Related projects

Alternatives and complementary repositories for HarmBench