haizelabs / get-haized
A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.
☆89 Updated 9 months ago
Alternatives and similar repositories for get-haized:
Users who are interested in get-haized are comparing it to the libraries listed below.
- Red-Teaming Language Models with DSPy ☆175 Updated last month
- Sphynx Hallucination Induction ☆53 Updated 2 months ago
- Verdict is a library for scaling judge-time compute. ☆190 Updated 2 weeks ago
- ☆48 Updated last year
- ☆22 Updated 5 months ago
- ☆31 Updated last week
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆31 Updated last month
- ☆47 Updated 11 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆87 Updated last month
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆85 Updated this week
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆30 Updated this week
- they've simulated websites, worlds, and imaginary CLIs... but what if they simulated *you*? ☆116 Updated 2 weeks ago
- ☆112 Updated 2 months ago
- Small, simple agent task environments for training and evaluation ☆18 Updated 5 months ago
- A framework-less approach to robust agent development. ☆156 Updated last week
- ☆89 Updated 6 months ago
- papers.day ☆93 Updated last year
- Releases from OpenAI Preparedness ☆276 Updated this week
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness. ☆26 Updated 8 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 8 months ago
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 Updated 10 months ago
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆59 Updated 10 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆98 Updated last year
- Use the OpenAI Batch tool to make async batch requests to the OpenAI API. ☆96 Updated last year
- ☆124 Updated last week
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆230 Updated last month
- Functional Benchmarks and the Reasoning Gap ☆84 Updated 6 months ago
- ☆120 Updated 3 weeks ago
- ☆50 Updated last year