LLM-Tuning-Safety / LLMs-Finetuning-Safety

We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
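For context, the fine-tuning route the description mentions is OpenAI's standard fine-tuning API. A minimal sketch of that generic workflow, with no adversarial data included, is shown below; the file name and model id are illustrative placeholders, not taken from this repository.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a chat-format JSONL training file; each line holds one
# {"messages": [...]} conversation example (placeholder file name).
training_file = client.files.create(
    file=open("training_examples.jsonl", "rb"),
    purpose="fine-tune",
)

# Launch a fine-tuning job against the uploaded file.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```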
