redteaming-arena / redteam-arena
☆31 · Updated last month
Alternatives and similar repositories for redteam-arena
Users who are interested in redteam-arena are comparing it to the libraries listed below:
- Just a bunch of benchmark logs for different LLMs ☆119 · Updated 8 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆89 · Updated last week
- Red-Teaming Language Models with DSPy ☆183 · Updated 2 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆102 · Updated last year
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆64 · Updated 5 months ago
- Verdict is a library for scaling judge-time compute. ☆199 · Updated last week
- look how they massacred my boy ☆63 · Updated 6 months ago
- Functional Benchmarks and the Reasoning Gap ☆85 · Updated 6 months ago
- OpenPipe ART (Agent Reinforcement Trainer): train LLM agents ☆108 · Updated this week
- Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search. ☆32 · Updated last month
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 · Updated 11 months ago
- Open source interpretability artefacts for R1. ☆82 · Updated this week
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness. ☆26 · Updated 8 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆93 · Updated 2 months ago
- Sphynx Hallucination Induction ☆53 · Updated 2 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆88 · Updated this week
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆51 · Updated last month
- A tree-based prefix cache library that allows rapid creation of looms: hierarchal branching pathways of LLM generations. ☆68 · Updated 2 months ago