haizelabs / BEAST-implementation
☆16Updated 11 months ago
Alternatives and similar repositories for BEAST-implementation
Users who are interested in BEAST-implementation are comparing it to the repositories listed below.
- Tree of Attacks (TAP) Jailbreaking Implementation☆108Updated last year
- ☆65Updated 3 months ago
- A utility to inspect, validate, sign and verify machine learning model files.☆57Updated 3 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.☆91Updated last month
- General research for Dreadnode☆23Updated 11 months ago
- Implementation of BEAST adversarial attack for language models (ICML 2024)☆86Updated last year
- Sphynx Hallucination Induction☆54Updated 3 months ago
- A prompt injection game to collect data for robust ML research☆56Updated 3 months ago
- Red-Teaming Language Models with DSPy☆192Updated 3 months ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models☆50Updated 9 months ago
- ☆21Updated last year
- [IJCAI 2024] Imperio is an LLM-powered backdoor attack. It allows the adversary to issue language-guided instructions to control the vict…☆41Updated 3 months ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs).☆109Updated last year
- ☆100Updated 2 months ago
- ☆32Updated 6 months ago
- Code to break Llama Guard☆31Updated last year
- A YAML-based format for describing tools to LLMs, like man pages but for robots!☆71Updated 2 weeks ago
- Data Scientists Go To Jupyter☆63Updated 2 months ago
- Manual Prompt Injection / Red Teaming Tool☆27Updated 7 months ago
- https://arxiv.org/abs/2412.02776☆54Updated 5 months ago
- A collection of prompt injection mitigation techniques.☆22Updated last year
- Fluent student-teacher redteaming☆20Updated 9 months ago
- ☆39Updated 7 months ago
- Small tools to assist with using Large Language Models☆11Updated last year
- ☆62Updated 5 months ago
- Code for the paper "Fishing for Magikarp"☆155Updated this week
- source code for the offsecml framework☆40Updated 11 months ago
- A library for red-teaming LLM applications with LLMs.☆26Updated 7 months ago
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness.☆26Updated 9 months ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research.☆92Updated this week