sinanw / llm-security-prompt-injection

This project investigates the security of large language models by performing binary classification of a set of input prompts to discover malicious prompts. Several approaches have been analyzed using classical ML algorithms, a trained LLM model, and a fine-tuned LLM model.
34Updated 11 months ago

Related projects

Alternatives and complementary repositories for llm-security-prompt-injection