redteaming-arena / redteam-arena
☆31 · Updated last week
Alternatives and similar repositories for redteam-arena:
Users who are interested in redteam-arena are comparing it to the repositories listed below:
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆89 · Updated 9 months ago
- ☆22 · Updated 5 months ago
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆31 · Updated last month
- look how they massacred my boy ☆63 · Updated 5 months ago
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆59 · Updated 10 months ago
- A strongly typed Python DSL for developing message passing multi agent systems ☆52 · Updated 11 months ago
- ☆38 · Updated 8 months ago
- ☆20 · Updated 5 months ago
- ☆66 · Updated 10 months ago
- ☆97 · Updated 5 months ago
- ☆53 · Updated 6 months ago
- Sphynx Hallucination Induction ☆53 · Updated 2 months ago
- ☆109 · Updated 2 weeks ago
- Verdict is a library for scaling judge-time compute. ☆192 · Updated 2 weeks ago
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you… ☆37 · Updated 10 months ago
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆63 · Updated 5 months ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆87 · Updated this week
- Modify Entropy Based Sampling to work with Mac Silicon via MLX ☆50 · Updated 4 months ago
- ☆48 · Updated last year
- Red-Teaming Language Models with DSPy ☆178 · Updated last month
- Lego for GRPO ☆26 · Updated this week
- ☆111 · Updated 3 months ago
- Prompt leak technique for Bing Chat ☆31 · Updated last year
- Verbosity control for AI agents ☆60 · Updated 10 months ago
- ☆125 · Updated this week
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆169 · Updated this week
- ☆48 · Updated 4 months ago
- ☆80 · Updated 2 months ago
- Functional Benchmarks and the Reasoning Gap ☆84 · Updated 6 months ago
- ☆124 · Updated 3 weeks ago