redteaming-arena / redteam-arena
☆31 Updated last month
Alternatives and similar repositories for redteam-arena
Users who are interested in redteam-arena are comparing it to the repositories listed below
- ☆22 Updated 6 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆91 Updated last month
- Modify Entropy Based Sampling to work with Mac Silicon via MLX ☆50 Updated 6 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆102 Updated last year
- ☆129 Updated last month
- Open source interpretability artefacts for R1. ☆109 Updated 3 weeks ago
- The next evolution of Agents ☆48 Updated 3 weeks ago
- Sphynx Hallucination Induction ☆54 Updated 3 months ago
- ☆54 Updated 7 months ago
- ☆48 Updated last year
- Lego for GRPO ☆28 Updated last month
- Functional Benchmarks and the Reasoning Gap ☆86 Updated 7 months ago
- ☆97 Updated 7 months ago
- ☆20 Updated 5 months ago
- Verdict is a library for scaling judge-time compute. ☆211 Updated 2 weeks ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆92 Updated this week
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆171 Updated last week
- ⚖️ Awesome LLM Judges ⚖️ ☆97 Updated 2 weeks ago
- ☆114 Updated 4 months ago
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆60 Updated 6 months ago
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 Updated 11 months ago
- KMD is a collection of conversational exchanges between patients and doctors on various medical topics. It aims to capture the intricaci… ☆24 Updated last year
- ☆22 Updated this week
- Red-Teaming Language Models with DSPy ☆192 Updated 3 months ago
- A tree-based prefix cache library that allows rapid creation of looms: hierarchical branching pathways of LLM generations. ☆68 Updated 3 months ago
- ☆74 Updated 3 weeks ago
- Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search. ☆32 Updated 2 months ago
- ☆81 Updated 4 months ago
- ☆27 Updated 9 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 9 months ago