JailBench: A Chinese dataset for evaluating jailbreak attack risks in large language models [PAKDD 2025]
☆167Mar 3, 2025Updated last year
Alternatives and similar repositories for JailBench
Users that are interested in JailBench are comparing it to the libraries listed below
- ☆12Sep 29, 2024Updated last year
- ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]☆226Sep 29, 2024Updated last year
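- "Stones from other hills may serve to polish jade": 复旦白泽智能 (Fudan Baize Intelligence) releases JADE-DB, a demo dataset targeting Chinese open-source and foreign commercial large models☆496Nov 18, 2025Updated 3 months ago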
- SC-Safety: A multi-turn adversarial safety benchmark for Chinese large language models☆150Mar 15, 2024Updated last year
- [ACL 2024] SALAD benchmark & MD-Judge☆171Mar 8, 2025Updated 11 months ago
- ☆21Jul 26, 2025Updated 7 months ago
- ☆25Nov 4, 2024Updated last year
- A detection-evading Filter-type in-memory webshell for Tomcat; the main idea is to bypass various detection methods☆10Nov 26, 2021Updated 4 years ago
- [USENIX'25] HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns☆13Mar 1, 2025Updated last year
- Official Implementation of implicit reference attack☆11Oct 16, 2024Updated last year
- Code for Rethinking Prompt Optimizers: From Prompt Merits to Optimization☆12Jan 12, 2026Updated last month
- Open5gs for Satellite Networks, DeepWiki https://deepwiki.com/root-hbx/open5gs-satellite☆11Jan 9, 2026Updated last month
- BCEL library built specifically for the woodpecker framework☆12Apr 30, 2021Updated 4 years ago
- IoM default mal package☆10Feb 22, 2026Updated last week
- AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM☆83Nov 3, 2024Updated last year
- Benchmarking Large Language Models' Resistance to Malicious Code☆14Dec 1, 2024Updated last year
- ☆20Jan 5, 2026Updated last month
- Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding☆151Jul 19, 2024Updated last year
- Tool to get an NT SYSTEM shell.☆24Jul 12, 2021Updated 4 years ago
- Chinese safety prompts for evaluating and improving the safety of LLMs.☆1,129Feb 27, 2024Updated 2 years ago
- SecProbe: A task-driven system for evaluating the security capabilities of large language models☆15Nov 29, 2024Updated last year
- [NDSS'25] The official implementation of safety misalignment.☆17Jan 8, 2025Updated last year
- ☆39May 17, 2025Updated 9 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]☆377Jan 23, 2025Updated last year
- ☆17Jun 30, 2023Updated 2 years ago
- MailSniper is a penetration testing tool for searching through email in a Microsoft Exchange environment for specific terms (passwords, i…☆13Nov 4, 2018Updated 7 years ago
- jasypt Decrypt Encrypt☆14Jan 7, 2022Updated 4 years ago
- Röttger et al. (2025): "MSTS: A Multimodal Safety Test Suite for Vision-Language Models"☆16Mar 31, 2025Updated 11 months ago
- Exploit tool for the FanRuan (帆软) BI deserialization vulnerability☆190Mar 23, 2024Updated last year
- Search msDS-AllowedToActOnBehalfOfOtherIdentity☆35Jan 17, 2022Updated 4 years ago
- Everything about SDRPi☆20Mar 17, 2023Updated 2 years ago
- Synthesizing realistic and diverse text-datasets from augmented LLMs☆16Jan 26, 2026Updated last month
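- 🚀 JailbreakBench is a testing tool for evaluating the safety of large language models (LLMs), focused on measuring a model's resistance to jailbreak attacks. By simulating malicious prompt injection, encoding attacks, and multi-turn dialogue manipulation, it quantifies a model's vulnerability risk and generates detailed reports and visual analyses. It supports Chinese and English datasets and is suited to security research…☆29Sep 1, 2025Updated 6 months ago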
- GUI Exploit Tool for CVE-2020-0688 (Microsoft Exchange default MachineKeySection deserialization vulnerability)☆16May 9, 2024Updated last year
- Flames is a highly adversarial benchmark in Chinese for LLM's harmlessness evaluation developed by Shanghai AI Lab and Fudan NLP Group.☆63May 21, 2024Updated last year
- ☆56May 21, 2025Updated 9 months ago
- DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails☆31Feb 26, 2025Updated last year
- Submission Guide + Discussion Board for AI Singapore Global Challenge for Safe and Secure LLMs (Track 1A).☆16Jul 4, 2024Updated last year
- Configure sqlmap to use a proxy automatically (fetches proxy IPs automatically)☆14Aug 6, 2020Updated 5 years ago