haizelabs / BEAST-implementation
☆16 Updated last year
Alternatives and similar repositories for BEAST-implementation
Users who are interested in BEAST-implementation are comparing it to the repositories listed below.
- Tree of Attacks (TAP) Jailbreaking Implementation ☆111 Updated last year
- A utility to inspect, validate, sign and verify machine learning model files. ☆57 Updated 5 months ago
- ☆65 Updated 5 months ago
- General research for Dreadnode ☆23 Updated last year
- Data Scientists Go To Jupyter ☆63 Updated 4 months ago
- A YAML based format for describing tools to LLMs, like man pages but for robots! ☆75 Updated 2 months ago
- ☆22 Updated 2 years ago
- [IJCAI 2024] Imperio is an LLM-powered backdoor attack. It allows the adversary to issue language-guided instructions to control the vict… ☆42 Updated 5 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆92 Updated 3 months ago
- CLI and API server for https://github.com/dreadnode/robopages ☆34 Updated this week
- Red-Teaming Language Models with DSPy ☆202 Updated 5 months ago
- ☆11 Updated last year
- ☆16 Updated last year
- Manual Prompt Injection / Red Teaming Tool ☆32 Updated 9 months ago
- The jailbreak-evaluation is an easy-to-use Python package for language model jailbreak evaluation. ☆24 Updated 8 months ago
- Code for the paper "Defeating Prompt Injections by Design" ☆43 Updated last month
- ☆34 Updated 8 months ago
- Sphynx Hallucination Induction ☆53 Updated 5 months ago
- A collection of prompt injection mitigation techniques. ☆23 Updated last year
- An interactive CLI application for interacting with authenticated Jupyter instances. ☆53 Updated 2 months ago
- ☆66 Updated last year
- Lightweight LLM Interaction Framework ☆296 Updated this week
- https://arxiv.org/abs/2412.02776 ☆59 Updated 7 months ago
- ☆41 Updated this week
- ☆121 Updated last month
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness. ☆26 Updated 11 months ago
- using ML models for red teaming ☆43 Updated last year
- Codebase for Obfuscated Activations Bypass LLM Latent-Space Defenses ☆21 Updated 5 months ago
- Small tools to assist with using Large Language Models ☆11 Updated last year
- Improve prompts for e.g. GPT3 and GPT-J using templates and hyperparameter optimization. ☆42 Updated 2 years ago