safellama / plexiglass
A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
☆116Updated 8 months ago
Related projects: ⓘ
- Red-Teaming Language Models with DSPy☆116Updated 5 months ago
- Payloads for Attacking Large Language Models☆56Updated 2 months ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs).☆103Updated 6 months ago
- Fiddler Auditor is a tool to evaluate language models.☆163Updated 6 months ago
- ⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs☆299Updated 7 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024]☆181Updated last month
- Dropbox LLM Security research code and results☆210Updated 3 months ago
- [Corca / ML] Automatically solved Gandalf AI with LLM☆46Updated last year
- A benchmark for prompt injection detection systems.☆80Updated last week
- Risks and targets for assessing LLMs & LLM vulnerabilities☆24Updated 3 months ago
- BlindBox is a tool to isolate and deploy applications inside Trusted Execution Environments for privacy-by-design apps☆57Updated 10 months ago
- A JupyterLab extension to evaluate the security of your Jupyter environment☆36Updated last year
- HoneyAgents is a PoC demo of an AI-driven system that combines honeypots with autonomous AI agents to detect and mitigate cyber threats. …☆34Updated 8 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.☆77Updated 3 months ago
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a…☆293Updated 6 months ago
- Lightweight LLM Interaction Framework☆181Updated this week
- ComPromptMized: Unleashing Zero-click Worms that Target GenAI-Powered Applications☆189Updated 6 months ago
- A trace analysis tool for AI agents.☆97Updated this week
- Curation of prompts that are known to be adversarial to large language models☆170Updated last year
- PhD/MSc course on Machine Learning Security (Univ. Cagliari)☆190Updated this week
- LLM security and privacy☆38Updated 5 months ago
- Adversarial Attacks on GPT-4 via Simple Random Search [Dec 2023]☆41Updated 4 months ago
- Protection against Model Serialization Attacks☆273Updated this week
- ☆15Updated 4 months ago
- source for llmsec.net☆11Updated last month
- A collection of awesome resources related AI security☆107Updated 5 months ago
- A curated list of MLSecOps tools, articles and other resources on security applied to Machine Learning and MLOps systems.☆220Updated last month
- Implementation of BEAST adversarial attack for language models (ICML 2024)☆72Updated 4 months ago
- A collection of prompt injection mitigation techniques.☆15Updated last year
- Explore AI Supply Chain Risk with the AI Risk Database☆44Updated 4 months ago