corca-ai / LLMFuzzAgent
[Corca / ML] Automatically solves the Gandalf AI challenge with an LLM
☆46 · Updated last year
Related projects:
- A benchmark for prompt injection detection systems. ☆80 · Updated last week
- Red-Teaming Language Models with DSPy ☆116 · Updated 5 months ago
- Turning Gandalf against itself. Uses LLMs to automate playing the Lakera Gandalf challenge without needing to set up an account with a platfor… ☆24 · Updated 11 months ago
- Uses the ChatGPT model to determine whether a user-supplied question is safe and filters out dangerous questions ☆42 · Updated last year
- Payloads for Attacking Large Language Models ☆56 · Updated 2 months ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ☆103 · Updated 6 months ago
- Approximation of the Claude 3 tokenizer by inspecting the generation stream ☆109 · Updated last month
- A curated set of prompts known to be adversarial to large language models ☆170 · Updated last year
- Masked Python SDK wrapper for the OpenAI API. Use public LLM APIs securely. ☆110 · Updated last year
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆77 · Updated 3 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆96 · Updated 10 months ago
- A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs). ☆116 · Updated 8 months ago
- Dropbox LLM Security research code and results ☆210 · Updated 3 months ago
- Framework for LLM evaluation, guardrails and security ☆94 · Updated last week
- Fiddler Auditor is a tool to evaluate language models. ☆163 · Updated 6 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024] ☆181 · Updated last month
- A trace analysis tool for AI agents. ☆97 · Updated this week
- A guide to LLM hacking: fundamentals, prompt injection, offense, and defense ☆112 · Updated last year
- Benchmark logs for a range of different LLMs ☆112 · Updated last month
- 🧠 LLMFuzzer: Fuzzing Framework for Large Language Models, the first open-source fuzzing framework specifically designed … ☆218 · Updated 7 months ago
- Simple retrieval from LLMs at various context lengths to measure accuracy ☆93 · Updated 5 months ago
- HoneyAgents is a PoC demo of an AI-driven system that combines honeypots with autonomous AI agents to detect and mitigate cyber threats. … ☆34 · Updated 8 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆81 · Updated 6 months ago
- Every practical and proposed defense against prompt injection. ☆310 · Updated 3 months ago
- Code for the paper "ROUTERBENCH: A Benchmark for Multi-LLM Routing System" ☆86 · Updated 3 months ago
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆92 · Updated 11 months ago
- Track the progress of LLM context utilisation ☆53 · Updated 2 months ago
- CodeSage: Code Representation Learning At Scale (ICLR 2024) ☆76 · Updated 2 months ago
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆107 · Updated 2 weeks ago