leileqiTHU/Attacker

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/leileqiTHU/Attacker)

leileqiTHU / Attacker

The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1

☆13

Alternatives and similar repositories for Attacker

Users that are interested in Attacker are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

fiveai / understanding_safety_finetuning
View on GitHub
Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)
☆12Oct 31, 2024Updated last year
yiksiu-chan / SpeakEasy
View on GitHub
[ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
☆14Mar 7, 2026Updated 4 months ago
thu-coai / AISafetyLab
View on GitHub
AISafetyLab: A comprehensive framework covering safety attack, defense, evaluation and paper list.
☆248Apr 21, 2026Updated 3 months ago
pkulcwmzx / knowledge-boundary
View on GitHub
[ACL 2024] Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation
☆10May 26, 2024Updated 2 years ago
YiyiyiZhao / siren
View on GitHub
Welcome to the official repository for Siren, a project aimed at understanding and mitigating harmful behaviors in large language models …
☆15Jun 14, 2026Updated last month
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
Jinxiaolong1129 / Foot-in-the-door-Jailbreak
View on GitHub
☆23May 14, 2025Updated last year
Karbo123 / recon
View on GitHub
an universal pytorch deep learning experiment codebase
☆11Mar 31, 2025Updated last year
finitearth / capo
View on GitHub
We introduce CAPO, a novel prompt optimization algorithm that integrates racing and multi-objective optimization for cost-efficiency and …
☆15Aug 3, 2025Updated 11 months ago
SALT-NLP / search_privacy_risk
View on GitHub
Code for the paper "Searching Privacy Risks in Multi-Agent Systems via Simulation"
☆24Oct 13, 2025Updated 9 months ago
yuki-younai / Jailbreak-R1
View on GitHub
offical implementation of Jailbreak-R1
☆15Jul 16, 2025Updated last year
AntigoneRandy / QCB-LACPDA
View on GitHub
Official Implementation for "Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approxim…
☆12Aug 14, 2024Updated last year
AI45Lab / X-Boundary
View on GitHub
[EMNLP 2025] The code repo of paper "X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Com…
☆41Nov 24, 2025Updated 7 months ago
kangmintong / R-2-Guard
View on GitHub
[ICLR 2025] Code implementation of R^2-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning
☆23Jul 8, 2024Updated 2 years ago
Tele-EVOL / TeleAI-Safety
View on GitHub
☆27Jan 5, 2026Updated 6 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
Yunhao-Feng / AgentHazard
View on GitHub
☆28Jun 13, 2026Updated last month
thu-coai / CDConv
View on GitHub
Data and codes for EMNLP 2022 paper "CDConv: A Benchmark for Contradiction Detection in Chinese Conversations"
☆13May 8, 2023Updated 3 years ago
THUSE-Course / course-index
View on GitHub
☆11Mar 3, 2026Updated 4 months ago
AntigoneRandy / PTYNet
View on GitHub
The official implementation of the paper "Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks".
☆19Apr 19, 2024Updated 2 years ago
davide97l / Pacman
View on GitHub
Implementation of many popular AI algorithms to play the game of Pacman such as Minimax, Expectimax and Greedy.
☆13Apr 12, 2020Updated 6 years ago
TrustAI-laboratory / Many-Shot-Jailbreaking-Demo
View on GitHub
Research on "Many-Shot Jailbreaking" in Large Language Models (LLMs). It unveils a novel technique capable of bypassing the safety mechan…
☆17Aug 6, 2024Updated last year
Shutdy / Shadowrocket-ADBlock-Rules
View on GitHub
提供多款 Shadowrocket 规则，带广告过滤功能。用于 iOS 未越狱设备选择性地自动翻墙。shadowrocket 配置：https://raw.githubusercontent.com/lhie1/Rules/master/Shadowrocket.conf…
☆14Oct 12, 2019Updated 6 years ago
osiriszjq / RobustPPE
View on GitHub
Robust Point Cloud Processing through Positional Embedding
☆14Sep 7, 2023Updated 2 years ago
xlr8harder / llm-compliance
View on GitHub
☆23Updated this week
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
VimalWill / Vstream
View on GitHub
Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)
☆10Feb 2, 2024Updated 2 years ago
thunlp / LLM-generated-text-detection
View on GitHub
☆13Nov 7, 2023Updated 2 years ago
meowpass / FollowComplexInstruction
View on GitHub
Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…
☆55Jun 24, 2024Updated 2 years ago
deep-spin / OpenNMT-APE
View on GitHub
☆34Nov 24, 2020Updated 5 years ago
Pixie8888 / GFS-3DSeg_GWs
View on GitHub
Code for ICCV 2023 work "Generalized Few-Shot Point Cloud Segmentation Via Geometric Words"
☆14Sep 26, 2023Updated 2 years ago
Atom-101 / FourierFeat-Siren
View on GitHub
Pytorch implementation and comparison of Fourier Feature Networks and Sinusoidal Representation Networks
☆13Jun 27, 2020Updated 6 years ago
mistahenry / DNS-Cache-Poisoning-with-Scapy
View on GitHub
Project for a Computer Security class based on CSAW capture the flag challenges
☆13Mar 19, 2014Updated 12 years ago
ruizheng20 / gpo
View on GitHub
The code of paper "Toward Optimal LLM Alignments Using Two-Player Games".
☆17Jun 20, 2024Updated 2 years ago
ZhentingWang / DIAGNOSIS
View on GitHub
☆23Apr 23, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
xunguangwang / SoK4JailbreakGuardrails
View on GitHub
[S&P 2026] SoK: Evaluating Jailbreak Guardrails for Large Language Models
☆44Dec 17, 2025Updated 7 months ago
ZrrSkywalker / MonoDETR-MV
View on GitHub
The multi-view version of MonoDETR on nuScenes dataset
☆21Nov 4, 2022Updated 3 years ago
T0hsakar1n / RAPID
View on GitHub
Source code and scripts for the paper "Is Difficulty Calibration All We Need? Towards More Practical Membership Inference Attacks"
☆20Dec 10, 2024Updated last year
THU-KEG / LRM-FactEval
View on GitHub
☆17Jun 25, 2025Updated last year
Dtc7w3PQ / Response-Attack
View on GitHub
Official implementation of “Response Attack: Exploiting Contextual Priming to Jailbreak Large Language Models” (AAAI 2026).
☆37Mar 22, 2026Updated 4 months ago
XiaoliChan / Xiaoli-Tools
View on GitHub
☆15Aug 1, 2023Updated 2 years ago
deepsec-top / deepsec
View on GitHub
中文网络安全运营领域开源语料库
☆59Aug 1, 2025Updated 11 months ago