neelsjain / baseline-defenses

Official Code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models"
20 · Updated last year

Related projects

Alternatives and complementary repositories for baseline-defenses