amazon-science / TurboFuzzLLM
TurboFuzzLLM: Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking Large Language Models in Practice
☆22 · Updated 2 months ago
Alternatives and similar repositories for TurboFuzzLLM
Users interested in TurboFuzzLLM are comparing it to the repositories listed below.
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ☆112 · Updated last year
- LLM security and privacy ☆54 · Updated last year
- A utility to inspect, validate, sign and verify machine learning model files. ☆65 · Updated last year
- Implementation of the BEAST adversarial attack for language models (ICML 2024) ☆91 · Updated last year
- General research for Dreadnode ☆27 · Updated last year
- ☆81 · Updated 3 months ago
- Contains all assets to run with Moonshot Library (Connectors, Datasets and Metrics) ☆39 · Updated this week
- ☆23 · Updated 2 years ago
- ☆190 · Updated last month
- Papers about red teaming LLMs and multimodal models. ☆159 · Updated 8 months ago
- TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a… ☆79 · Updated 5 months ago
- A benchmark for prompt injection detection systems. ☆158 · Updated last month
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks. ☆103 · Updated last year
- Code for the paper "Defeating Prompt Injections by Design" ☆246 · Updated 7 months ago
- Cyber-Zero: Training Cybersecurity Agents Without Runtime ☆69 · Updated last week
- A Python library for evaluating guardrail models. ☆30 · Updated 3 months ago
- SECURE: Benchmarking Generative Large Language Models as a Cyber Advisory ☆15 · Updated last year
- ☆34 · Updated last year
- This repository provides a benchmark for prompt injection attacks and defenses in LLMs ☆384 · Updated 3 months ago
- A simulated network security environment for developing and testing AI-based agents. Part of the AI Dojo project ☆57 · Updated 2 weeks ago
- Tree of Attacks (TAP) Jailbreaking Implementation ☆117 · Updated 2 years ago
- LLM proxy to observe and debug what your AI agents are doing. ☆64 · Updated 3 months ago
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" ☆54 · Updated last year
- Code snippets to reproduce MCP tool poisoning attacks. ☆192 · Updated 9 months ago
- Code repository for "AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models" ☆92 · Updated this week
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆425 · Updated this week
- Risks and targets for assessing LLMs & LLM vulnerabilities ☆33 · Updated last year
- PhD/MSc course on Machine Learning Security (Univ. Cagliari) ☆226 · Updated last month
- [ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability ☆176 · Updated last year
- Honest-but-Curious Nets: Sensitive Attributes of Private Inputs Can Be Secretly Coded into the Classifiers' Outputs (ACM CCS'21) ☆17 · Updated 3 years ago