nwinter / ultimate-jailbreaking-championship

Playing around with various jailbreaking techniques ahead of the Gray Swan AI Ultimate Jailbreaking Competition

☆14

Alternatives and similar repositories for ultimate-jailbreaking-championship

Users who are interested in ultimate-jailbreaking-championship are comparing it to the repositories listed below.

iSE-UET-VNU / RAMBO
Official implementation of our paper: "RAMBO: Enhancing RAG-based Repository-Level Method Body Completion"
☆14 Updated 5 months ago
thinh-dao / AI-for-TicTacToe-and-Gomoku-
☆11 Updated last year
Y-L-LIU / MGTBench-2.0
☆26 Updated 6 months ago
liu00222 / Open-Prompt-Injection
This repository provides a benchmark for prompt injection attacks and defenses
☆301 Updated this week
GraySwanAI / nanoGCG
A fast + lightweight implementation of the GCG algorithm in PyTorch
☆289 Updated 5 months ago
IBM / raven-large-language-models
Code for I-RAVEN-X generation and experiments
☆17 Updated last month
ThuCCSLab / JailbreakEval
[NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts.
☆171 Updated 6 months ago
gnarcoding / firewall_try_harder
☆12 Updated 9 months ago
AI-safety-book / AI-safety-book.github.io
☆17 Updated 8 months ago
Libr-AI / OpenRedTeaming
Papers about red teaming LLMs and Multimodal models.
☆144 Updated 4 months ago
ethz-spylab / agentdojo
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
☆318 Updated this week
SinMaven / Network-Security
A curated collection of courses, videos, and resources to master network security from the ground up.
☆10 Updated 9 months ago
JailbreakBench / artifacts
Jailbreak artifacts for JailbreakBench
☆70 Updated 11 months ago
aypan17 / latentqa
☆21 Updated 6 months ago
centerforaisafety / HarmBench
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
☆752 Updated last year
BHui97 / PLeak
☆64 Updated 10 months ago
jiah-li / magic
The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models.
☆11 Updated 10 months ago
dtch1997 / steering-bench
Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"
☆15 Updated 10 months ago
patrickrchao / JailbreakingLLMs
☆634 Updated 3 months ago
uiuc-kang-lab / InjecAgent
☆81 Updated last year
RICommunity / TAP
TAP: An automated jailbreaking method for black-box LLMs
☆188 Updated 10 months ago
JailbreakBench / jailbreakbench
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
☆430 Updated 6 months ago
hgabor / nestjs-keret-2024
NestJS project template, configured with prisma and ejs
☆12 Updated 10 months ago
tml-epfl / llm-adaptive-attacks
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
☆353 Updated 8 months ago
shaheennabi / Production-Ready-LeafLogic-Multi-AI-Agents-Project
🍃 Production-ready: Just upload a photo of any plant or crop, the system takes care of the rest. Powered by advanced object detection an…
☆13 Updated 6 months ago
UKGovernmentBEIS / as-evaluation-standard
A repository that holds templates, examples, and tests to help external parties submit tasks to AISI that conform with the Autonomous Sys…
☆11 Updated 8 months ago
Svigo-o / HUST_CyberSecurity_MasterExam_File
☆17 Updated 9 months ago
tezansahu / ai-garage
Mini-Projects using Cutting-Edge AI Frameworks
☆14 Updated last month
AI-secure / RedCode
[NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents
☆50 Updated 3 months ago
datasci-w266 / 2025-spring-main
☆10 Updated 6 months ago