microsoft / gandalf_vs_gandalf
Turning Gandalf against itself. Use LLMs to automate playing Lakera Gandalf challenge without needing to set up an account with a platform provider.
☆25Updated last year
Related projects ⓘ
Alternatives and complementary repositories for gandalf_vs_gandalf
- [Corca / ML] Automatically solved Gandalf AI with LLM☆46Updated last year
- A text embedding viewer for the Jupyter environment☆18Updated 9 months ago
- A benchmark for prompt injection detection systems.☆87Updated 2 months ago
- A writeup for the Gandalf prompt injection game.☆36Updated last year
- ☆20Updated this week
- HoneyAgents is a PoC demo of an AI-driven system that combines honeypots with autonomous AI agents to detect and mitigate cyber threats. …☆38Updated 10 months ago
- Code for the website www.jailbreakchat.com☆74Updated last year
- Lakera - ChatGPT Data Leak Protection☆23Updated 4 months ago
- Guard your LangChain applications against prompt injection with Lakera ChainGuard.☆18Updated 7 months ago
- Make your GenAI Apps Safe & Secure Test & harden your system prompt☆404Updated last month
- Red-Teaming Language Models with DSPy☆142Updated 7 months ago
- The project serves as a strategic advisory tool, capitalizing on the ZySec series of AI models to amplify the capabilities of security pr…☆40Updated 6 months ago
- ☆20Updated 2 months ago
- ☆34Updated 3 months ago
- Payloads for Attacking Large Language Models☆64Updated 4 months ago
- ATLAS tactics, techniques, and case studies data☆49Updated last month
- Explore AI Supply Chain Risk with the AI Risk Database☆50Updated 6 months ago
- ComPromptMized: Unleashing Zero-click Worms that Target GenAI-Powered Applications☆193Updated 8 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training".☆84Updated 8 months ago
- Project LLM Verification Standard☆36Updated 7 months ago
- ⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs☆315Updated 9 months ago
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a…☆313Updated 8 months ago
- source for llmsec.net☆12Updated 3 months ago
- 📚 A curated list of papers & technical articles on AI Quality & Safety☆161Updated last year
- ☆63Updated this week
- Dropbox LLM Security research code and results☆217Updated 6 months ago
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming"☆33Updated 2 months ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs).☆107Updated 8 months ago
- Curation of prompts that are known to be adversarial to large language models☆174Updated last year
- The fastest && easiest LLM security guardrails for AI Agents and applications.☆101Updated 2 weeks ago