haizelabs / BEAST-implementation

☆16

Alternatives and similar repositories for BEAST-implementation

Users who are interested in BEAST-implementation are comparing it to the repositories listed below.

dreadnode / parley
Tree of Attacks (TAP) Jailbreaking Implementation
☆115 Updated last year
dreadnode / tensor-man
A utility to inspect, validate, sign and verify machine learning model files.
☆58 Updated 6 months ago
dreadnode / example-agents
Example agents for the Dreadnode platform
☆16 Updated last month
NickNameInvalid / LLM_CTF
☆65 Updated 7 months ago
dreadnode / research
General research for Dreadnode
☆25 Updated last year
wunderwuzzi23 / token-turbulenz
☆25 Updated 2 years ago
haizelabs / dspy-redteam
Red-Teaming Language Models with DSPy
☆212 Updated 6 months ago
dreadnode / robopages
A YAML based format for describing tools to LLMs, like man pages but for robots!
☆78 Updated 3 months ago
haizelabs / get-haized
A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.
☆95 Updated 4 months ago
bsinger98 / Incalmo
☆52 Updated 2 weeks ago
moohax / Charcuterie
Data Scientists Go To Jupyter
☆65 Updated 5 months ago
dreadnode / paperstack
Arxiv + Notion Sync
☆19 Updated 3 months ago
PalisadeResearch / intercode
https://arxiv.org/abs/2412.02776
☆59 Updated 8 months ago
lve-org / lve
A repository of Language Model Vulnerabilities and Exposures (LVEs).
☆113 Updated last year
trailofbits / pajaMAS
Multi-agent system (MAS) hijacking demos
☆31 Updated last month
google-research / camel-prompt-injection
Code for the paper "Defeating Prompt Injections by Design"
☆94 Updated 2 months ago
dreadnode / robopages-cli
CLI and API server for https://github.com/dreadnode/robopages
☆35 Updated this week
vinusankars / BEAST
Implementation of BEAST adversarial attack for language models (ICML 2024)
☆91 Updated last year
haizelabs / thorn-in-haizestack
Thorn in a HaizeStack test for evaluating long-context adversarial robustness.
☆26 Updated last year
5stars217 / malicious_models
using ML models for red teaming
☆44 Updated 2 years ago
trailofbits / CompChomper
CompChomper is a framework for measuring how LLMs perform at code completion.
☆20 Updated 4 months ago
andyzorigin / cybench
☆142 Updated 2 months ago
METR / vivaria
Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
☆110 Updated this week
dreadnode / marque
Minimal workflows
☆20 Updated last year
dreadnode / conferences
☆17 Updated last year
JosephTLucas / vger
An interactive CLI application for interacting with authenticated Jupyter instances.
☆54 Updated 3 months ago
haizelabs / sphynx
Sphynx Hallucination Induction
☆53 Updated 7 months ago
dreadnode / rigging
Lightweight LLM Interaction Framework
☆367 Updated this week
HumanCompatibleAI / tensor-trust
A prompt injection game to collect data for robust ML research
☆63 Updated 7 months ago
5stars217 / offsecml
source code for the offsecml framework
☆41 Updated last year