RenatoGeh / advtok
Adversarial Tokenization
☆34Updated 3 months ago
Alternatives and similar repositories for advtok
Users who are interested in advtok are comparing it to the repositories listed below
- Implementation of BEAST adversarial attack for language models (ICML 2024)☆91Updated last year
- General research for Dreadnode☆25Updated last year
- [IJCAI 2024] Imperio is an LLM-powered backdoor attack. It allows the adversary to issue language-guided instructions to control the vict…☆43Updated 9 months ago
- Tree of Attacks (TAP) Jailbreaking Implementation☆115Updated last year
- ☆84Updated 3 months ago
- Source code of "TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification", ACL 2024 (Findings)☆13Updated last year
- A productionized greedy coordinate gradient (GCG) attack tool for large language models (LLMs)☆145Updated 11 months ago
- [EMNLP 2024] Holistic Automated Red Teaming for Large Language Models through Top-Down Test Case Generation and Multi-turn Interaction☆17Updated last year
- ☆98Updated last year
- Arxiv + Notion Sync☆20Updated 6 months ago
- Example agents for the Dreadnode platform☆19Updated this week
- ☆63Updated last week
- This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang…☆20Updated last year
- ☆17Updated last year
- Using ML models for red teaming☆44Updated 2 years ago
- An interactive CLI application for interacting with authenticated Jupyter instances.☆55Updated 6 months ago
- Remote code execution in Power Platform connectors via JSON deserialization☆23Updated 2 years ago
- Nemesis agent for Mythic☆27Updated last year
- All things specific to LLM Red Teaming Generative AI☆29Updated last year
- Attack to induce hallucinations in LLMs☆162Updated last year
- Papers about red teaming LLMs and Multimodal models.☆154Updated 5 months ago
- Entra ID Password Protection Banned Password Lists☆16Updated last year
- Indirect Prompt Injection Methodology (IPIM) - A structured process which security professionals can use to find Indirect Prompt Injectio…☆13Updated 3 months ago
- Helper script for BloodHound to automatically add relationships between multiple accounts owned by the same individual☆14Updated 3 years ago
- An improvement and a different approach to Mockingjay Self-Injection.☆35Updated last year
- ☆93Updated last year
- ☆15Updated 2 years ago
- MLOps Attack Toolkit☆28Updated 2 months ago
- The jailbreak-evaluation is an easy-to-use Python package for language model jailbreak evaluation.☆27Updated last year
- Central repo for talks and presentations☆46Updated last year